llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-12-15 04:30:12 +00:00

Author	SHA1	Message	Date
David Majnemer	8db493c4e1	InstCombine: Ensure select condition types are identical before merging Selection conditions may be vectors or scalars. Make sure InstCombine doesn't indiscriminately assume that a select which is value dependent on another select have identical select condition types. This fixes PR22773. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231156 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-03 22:40:36 +00:00
Nadav Rotem	10faa1b211	Teach ComputeNumSignBits about signed divisions. http://reviews.llvm.org/D8028 rdar://20023136 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231140 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-03 21:39:02 +00:00
Duncan P. N. Exon Smith	b056aa798d	DebugInfo: Move new hierarchy into place Move the specialized metadata nodes for the new debug info hierarchy into place, finishing off PR22464. I've done bootstraps (and all that) and I'm confident this commit is NFC as far as DWARF output is concerned. Let me know if I'm wrong :). The code changes are fairly mechanical: - Bumped the "Debug Info Version". - `DIBuilder` now creates the appropriate subclass of `MDNode`. - Subclasses of DIDescriptor now expect to hold their "MD" counterparts (e.g., `DIBasicType` expects `MDBasicType`). - Deleted a ton of dead code in `AsmWriter.cpp` and `DebugInfo.cpp` for printing comments. - Big update to LangRef to describe the nodes in the new hierarchy. Feel free to make it better. Testcase changes are enormous. There's an accompanying clang commit on its way. If you have out-of-tree debug info testcases, I just broke your build. - `upgrade-specialized-nodes.sh` is attached to PR22564. I used it to update all the IR testcases. - Unfortunately I failed to find way to script the updates to CHECK lines, so I updated all of these by hand. This was fairly painful, since the old CHECKs are difficult to reason about. That's one of the benefits of the new hierarchy. This work isn't quite finished, BTW. The `DIDescriptor` subclasses are almost empty wrappers, but not quite: they still have loose casting checks (see the `RETURN_FROM_RAW()` macro). Once they're completely gutted, I'll rename the "MD" classes to "DI" and kill the wrappers. I also expect to make a few schema changes now that it's easier to reason about everything. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231082 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-03 17:24:31 +00:00
Peter Collingbourne	27821d7200	LowerBitSets: Use byte arrays instead of bit sets to represent in-memory bit sets. By loading from indexed offsets into a byte array and applying a mask, a program can test bits from the bit set with a relatively short instruction sequence. For example, suppose we have 15 bit sets to lay out:
A (16 bits), B (15 bits), C (14 bits), D (13 bits), E (12 bits), F (11 bits), G (10 bits), H (9 bits), I (7 bits), J (6 bits), K (5 bits), L (4 bits), M (3 bits), N (2 bits), O (1 bit)
These bits can be laid out in a 16-byte array like this:
      Byte Offset
    0123456789ABCDEF
Bit
  7 HHHHHHHHHIIIIIII
  6 GGGGGGGGGGJJJJJJ
  5 FFFFFFFFFFFKKKKK
  4 EEEEEEEEEEEELLLL
  3 DDDDDDDDDDDDDMMM
  2 CCCCCCCCCCCCCCNN
  1 BBBBBBBBBBBBBBBO
  0 AAAAAAAAAAAAAAAA
For example, to test bit X of A, we evaluate ((bits[X] & 1) != 0), or to test bit X of I, we evaluate ((bits[9 + X] & 0x80) != 0). This can be done in 1-2 machine instructions on x86, or 4-6 instructions on ARM. This uses the LPT multiprocessor scheduling algorithm to lay out the bits efficiently. Saves ~450KB of instructions in a recent build of Chromium. Differential Revision: http://reviews.llvm.org/D7954 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231043 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-03 00:49:28 +00:00
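The bit test described in 27821d7200 can be sketched in a few lines of Python. This is only an illustration of the indexing scheme from the commit message, not the LowerBitSets implementation: `LAYOUT`, `pack`, and `bit_is_set` are hypothetical names, and only sets A and I from the example layout are modeled.

```python
# Hypothetical illustration of the byte-array layout; none of these names are LLVM APIs.

# (byte offset, bit mask) for two of the sets in the commit's example layout.
LAYOUT = {
    "A": (0, 0x01),  # 16 bits: bit 0 of bytes 0..15
    "I": (9, 0x80),  # 7 bits: bit 7 of bytes 9..15
}


def pack(sets, size=16):
    """Build the shared byte array from {set name: iterable of member bit indices}."""
    data = bytearray(size)
    for name, members in sets.items():
        offset, mask = LAYOUT[name]
        for x in members:
            data[offset + x] |= mask
    return data


def bit_is_set(data, name, x):
    """Evaluate (bits[offset + x] & mask) != 0 for bit x of the named set."""
    offset, mask = LAYOUT[name]
    return (data[offset + x] & mask) != 0


bits = pack({"A": [0, 3, 15], "I": [2, 6]})
```

Both sets share byte 15 here (bit 15 of A and bit 6 of I), which is the point of the layout: several bit sets occupy different bit positions of the same bytes.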
Benjamin Kramer	51a833938f	LoopIdiom: Give globals for memset_pattern16 private linkage. There's really no reason to have them have entries in the symbol table anymore. Old versions of ld64 had some bugs in this area but those have been fixed long ago. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231041 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-03 00:17:09 +00:00
Sanjoy Das	caee94bbb4	Revert some changes that were made to fix PR20680. This re-lands change r230921. r230921 was reverted because it broke a clang test; a checkin fixing the clang test will be committed shortly. Summary: As far as I can tell, the real bug causing the issue was fixed in r230533. SCEVExpander should mark an increment operation as nuw or nsw only if it can prove that the operation does not overflow. There shouldn't be any situation where we have to do something different because of no-wrap flags generated by SCEVExpander. Revert "IndVarSimplify: Allow LFTR to fire more often" This reverts commit `1ade0f0faa` (SVN: 222213). Revert "IndVarSimplify: Don't let LFTR compare against a poison value" This reverts commit `c0f2b8b528` (SVN: 217102). Reviewers: majnemer, atrick, spatel Differential Revision: http://reviews.llvm.org/D7979 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@231018 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-02 21:41:07 +00:00
NAKAMURA Takumi	6cad61163a	Revert r230921, "Revert some changes that were made to fix PR20680.", for now. It caused a failure on clang/test/Misc/backend-optimization-failure.cpp . git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230929 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-02 01:14:03 +00:00
Sanjoy Das	008dd56706	Revert some changes that were made to fix PR20680. Summary: As far as I can tell, the real bug causing the issue was fixed in r230533. SCEVExpander should mark an increment operation as nuw or nsw only if it can prove that the operation does not overflow. There shouldn't be any situation where we have to do something different because of no-wrap flags generated by SCEVExpander. Revert "IndVarSimplify: Allow LFTR to fire more often" This reverts commit `1ade0f0faa` (SVN: 222213). Revert "IndVarSimplify: Don't let LFTR compare against a poison value" This reverts commit `c0f2b8b528` (SVN: 217102). Reviewers: majnemer, atrick, spatel Differential Revision: http://reviews.llvm.org/D7979 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230921 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-01 23:36:26 +00:00
Duncan P. N. Exon Smith	17549804c1	DebugInfo: Convert DW_OP_piece => DW_OP_bit_piece r228631 stopped using `DW_OP_piece` inside `DIExpression`s in the IR, but it apparently missed updating these testcases. Caught by verifier checks for `MDExpression` while working on moving the new hierarchy into place. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230882 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-28 23:57:16 +00:00
Duncan P. N. Exon Smith	80f65ca7ee	Fix line endings on Transforms/Inline/inline_dbg_declare.ll git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230870 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-28 21:38:32 +00:00
Benjamin Kramer	6b7603962a	TRE: Just erase dead BBs and tweak the iteration loop not to increment the deleted BB iterator. Leaving empty blocks around just opens up a can of bugs like PR22704. Deleting them early also slightly simplifies code. Thanks to Sanjay for the IR test case. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230856 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-28 16:47:27 +00:00
Philip Reames	af690c9cd3	[RewriteStatepointsForGC] Fix another order of iteration bug It turns out the naming of inserted phis and selects is sensitive to the order in which two sets are iterated. We need to nail this down to avoid non-deterministic output and possible test failures. The modified test is the one I first noticed something odd in. The change is making it more strict to report the error. With the test change, but without the code change, the test fails roughly 1 in 5. With the code change, I've run ~30 runs without error. Long term, the right fix here is to adjust the naming scheme. I'm checking in this hack to avoid any possible non-determinism in the tests over the weekend. Just because I only noticed one case doesn't mean it's actually the only case. I hope to get to the right change Monday. std->llvm data structure changes bugfix change #3 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230835 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-28 01:52:09 +00:00
Philip Reames	a3f59e44cd	[RewriteStatepointsForGC] Add tests for the base pointer identification algorithm These tests cover the 'base object' identification and rewriting portion of RewriteStatepointsForGC. These aren't completely exhaustive, but they've proven to be reasonably effective over time at finding regressions. In the process of porting these tests over, I found my first "cleanup per llvm code style standards" bug. We were relying on the order of iteration when testing the base pointers found for a derived pointer. When we switched from std::set to DenseSet, this stopped being a safe assumption. I'm suspecting I'm going to find more of those. In particular, I'm now really wondering about the main iteration loop for this algorithm. I need to go take a closer look at the assumptions there. I'm not really happy with the fact these are testing what is essentially debug output (i.e. enabled via command line flags). Suggestions for how to structure this better are very welcome. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230818 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-28 00:20:48 +00:00
David Blaikie	7c9c6ed761	[opaque pointer type] Add textual IR support for explicit type parameter to load instruction Essentially the same as the GEP change in r230786. A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278)
import fileinput
import sys
import re

pat = re.compile(r"((?:=|:|^)\s*load (?:atomic )?(?:volatile )?(.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$)")

for line in sys.stdin:
  sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line))

Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7649 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230794 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-27 21:17:42 +00:00
David Blaikie	198d8baafb	[opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction One of several parallel first steps to remove the target type of pointers, replacing them with a single opaque pointer type. This adds an explicit type parameter to the gep instruction so that when the first parameter becomes an opaque pointer type, the type to gep through is still available to the instructions.
* This doesn't modify gep operators, only instructions (operators will be handled separately)
* Textual IR changes only. Bitcode (including upgrade) and changing the in-memory representation will be in separate changes.
* geps of vectors are transformed as: getelementptr <4 x float*> %x, ... -> getelementptr float, <4 x float*> %x, ... Then, once the opaque pointer type is introduced, this will ultimately look like: getelementptr float, <4 x ptr> %x with the unambiguous interpretation that it is a vector of pointers to float.
* address spaces remain on the pointer, not the type: getelementptr float addrspace(1)* %x -> getelementptr float, float addrspace(1)* %x Then, eventually: getelementptr float, ptr addrspace(1) %x
Importantly, the massive amount of test case churn has been automated by some crappy python code. I had to manually update a few test cases that wouldn't fit the script's model (r228970,r229196,r229197,r229198). The python script just massages stdin and writes the result to stdout. I then wrapped that in a shell script to handle replacing files, then used the usual find+xargs to migrate all the files.
update.py:
import fileinput
import sys
import re

ibrep = re.compile(r"(^.*?[^%\w]getelementptr inbounds )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")
normrep = re.compile(       r"(^.*?[^%\w]getelementptr )(((?:<\d* x )?)(.*?)(| addrspace\(\d\)) *\*(|>)(?:$| *(?:%|@|null|undef|blockaddress|getelementptr|addrspacecast|bitcast|inttoptr|\[\[[a-zA-Z]|\{\{).*$))")

def conv(match, line):
  if not match:
    return line
  line = match.groups()[0]
  if len(match.groups()[5]) == 0:
    line += match.groups()[2]
  line += match.groups()[3]
  line += ", "
  line += match.groups()[1]
  line += "\n"
  return line

for line in sys.stdin:
  if line.find("getelementptr ") == line.find("getelementptr inbounds"):
    if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("):
      line = conv(re.match(ibrep, line), line)
  elif line.find("getelementptr ") != line.find("getelementptr ("):
    line = conv(re.match(normrep, line), line)
  sys.stdout.write(line)

apply.sh:
for name in "$@"
do
  python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name"
  rm -f "$name.tmp"
done

The actual commands:
From llvm/src:
find test/ -name *.ll | xargs ./apply.sh
From llvm/src/tools/clang:
find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}"
From llvm/src/tools/polly:
find test/ -name *.ll | xargs ./apply.sh

After that, check-all (with llvm, clang, clang-tools-extra, lld, compiler-rt, and polly all checked out). The extra 'rm' in the apply.sh script is due to a few files in clang's test suite using interesting unicode stuff that my python script was throwing exceptions on. None of those files needed to be migrated, so it seemed sufficient to ignore those cases.
Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7636 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230786 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-27 19:29:02 +00:00
Sanjoy Das	0e93eb80f3	IRCE: add a test case for r230619. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230680 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-26 20:14:32 +00:00
Hal Finkel	532af6859f	[InstCombine/PowerPC] Convert aligned QPX load/store intrinsics into loads/stores InstCombine has long had logic to convert aligned Altivec load/store intrinsics into regular loads and stores. This mirrors that functionality for QPX vector load/store intrinsics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230660 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-26 18:56:03 +00:00
Hal Finkel	9f9a6fd453	[InstCombine] Add a test for altivec load/store intrinsic simplification InstCombine has logic to convert aligned Altivec load/store intrinsics into regular loads and stores. Unfortunately, there seems to be no regression test covering this behavior. Adding one... git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230632 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-26 14:22:41 +00:00
Sanjoy Das	21b2edfeae	IRCE: generalize to handle loops with decreasing induction variables. IRCE can now split the iteration space for loops like: for (i = n; i >= 0; i--) a[i + k] = 42; // bounds check on access git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230618 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-26 08:19:31 +00:00
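As a rough illustration of what "splitting the iteration space" means for the decreasing loop in 21b2edfeae, the sketch below compares a fully checked loop with a hand-split version. It is not the IRCE pass: `store_checked` and `store_split` are hypothetical names, and treating a failed bounds check as "stop and report the failing index" is an assumption made so that the two versions can be compared.

```python
def in_bounds(a, index):
    """The bounds check that guards every access in the original loop."""
    return 0 <= index < len(a)


def store_checked(a, n, k):
    """Reference loop: for (i = n; i >= 0; i--) a[i + k] = 42, checked every iteration."""
    for i in range(n, -1, -1):
        if not in_bounds(a, i + k):
            return i  # iteration at which the check failed
        a[i + k] = 42
    return None


def store_split(a, n, k):
    """Same loop, split so that the middle part needs no bounds check."""
    # Range of i for which 0 <= i + k < len(a) is known to hold.
    hi = min(n, len(a) - k - 1)
    lo = max(0, -k)
    i = n
    # Pre-loop: iterations above the safe range keep their check.
    while i >= 0 and i > hi:
        if not in_bounds(a, i + k):
            return i
        a[i + k] = 42
        i -= 1
    # Main loop: lo <= i <= hi, so the access is in bounds by construction.
    while i >= lo:
        a[i + k] = 42
        i -= 1
    # Post-loop: iterations below the safe range keep their check.
    while i >= 0:
        if not in_bounds(a, i + k):
            return i
        a[i + k] = 42
        i -= 1
    return None
```

For any `n`, `k`, and array length, both functions should leave the array in the same state and report the same failing iteration, which is the property a range-check-eliminating transform has to preserve.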
Ramkumar Ramachandra	e10581ac39	PlaceSafepoints: use IRBuilder helpers Use the IRBuilder helpers for gc.statepoint and gc.result, instead of coding the construction by hand. Note that the gc.statepoint IRBuilder handles only CallInst, not InvokeInst; retain that part of hand-coding. Differential Revision: http://reviews.llvm.org/D7518 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230591 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-26 00:35:56 +00:00
Sanjay Patel	8e3ef7f186	only propagate equality comparisons of FP values that we are certain are non-zero This is a follow-on to r227491 which tightens the check for propagating FP values. If a non-constant value happens to be a zero, we would hit the same bug as before. Bug noted and patch suggested by Eli Friedman. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230564 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-25 22:46:08 +00:00
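The hazard behind 8e3ef7f186 is that two floating-point values can compare equal and still be distinguishable, the usual case being positive and negative zero. The sketch below is a plain Python illustration of that fact, not LLVM code; `sign_of` and `propagate_equal` are hypothetical helpers.

```python
import math


def sign_of(x):
    """Return +1.0 or -1.0 according to the sign bit of x (zeros included)."""
    return math.copysign(1.0, x)


def propagate_equal(x, y):
    """Naive rewrite: once x == y is known, compute on y instead of x."""
    return sign_of(y) if x == y else sign_of(x)


x, y = 0.0, -0.0
original = sign_of(x)              # computed on x itself
rewritten = propagate_equal(x, y)  # computed on y because x == y
```

Here `x == y` holds, yet `original` and `rewritten` differ in sign, which is why the substitution is only safe when the value is known to be non-zero.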
JF Bastien	6fec24744f	InstCombine: extract instead of shuffle when performing vector/array type punning Summary: SROA generates code that isn't quite as easy to optimize and contains unusual-sized shuffles, but that code is generally correct. As discussed in D7487 the right place to clean things up is InstCombine, which will pick up the type-punning pattern and transform it into a more obvious bitcast+extractelement, while leaving the other patterns SROA encounters as-is. Test Plan: make check Reviewers: jvoung, chandlerc Subscribers: llvm-commits git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230560 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-25 22:30:51 +00:00
Peter Collingbourne	d63e5ad9c5	LowerBitSets: Align referenced globals. This change aligns globals to the next highest power of 2 bytes, up to a maximum of 128. This makes it more likely that we will be able to compress bit sets with a greater alignment. In many more cases, we can now take advantage of a new optimization also introduced in this patch that removes bit set checks if the bit set is all ones. The 128 byte maximum was found to provide the best tradeoff between instruction overhead and data overhead in a recent build of Chromium. It allows us to remove ~2.4MB of instructions at the cost of ~250KB of data. Differential Revision: http://reviews.llvm.org/D7873 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230540 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-25 20:42:41 +00:00
Sanjoy Das	a0a0b40aa3	Bugfix: SCEVExpander incorrectly marks increment operations as no-wrap (The change was landed in r230280 and caused the regression PR22674. This version contains a fix and a test-case for PR22674). When emitting the increment operation, SCEVExpander marks the operation as nuw or nsw based on the flags on the preincrement SCEV. This is incorrect because, for instance, it is possible that {-6,+,1} is <nuw> while {-6,+,1}+1 = {-5,+,1} is not. This change teaches SCEV to mark the increment as nuw/nsw only if it can explicitly prove that the increment operation won't overflow. Apart from the attached test case, another (more realistic) manifestation of the bug can be seen in Transforms/IndVarSimplify/pr20680.ll. Differential Revision: http://reviews.llvm.org/D7778 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230533 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-25 20:02:59 +00:00
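The {-6,+,1} example from a0a0b40aa3 can be checked with small numbers. The sketch below assumes an 8-bit unsigned type and a loop that runs six iterations; both are assumptions chosen for illustration, and `addrec_values` is a hypothetical helper, not a SCEV API.

```python
BITS = 8          # assumed width, chosen only to keep the numbers small
MOD = 1 << BITS


def addrec_values(start, step, count):
    """Return the first `count` values of {start,+,step} and whether any step wrapped (unsigned)."""
    value = start % MOD
    values = [value]
    wrapped = False
    for _ in range(count - 1):
        if value + step >= MOD:
            wrapped = True
        value = (value + step) % MOD
        values.append(value)
    return values, wrapped


# Pre-increment recurrence {-6,+,1} over six iterations: 250 .. 255, no unsigned wrap.
pre_values, pre_wrapped = addrec_values(-6, 1, 6)

# Post-increment recurrence {-5,+,1} over the same iterations: 251 .. 255, then 0.
post_values, post_wrapped = addrec_values(-5, 1, 6)

# The increment instruction itself (value + 1) wraps on the last iteration.
increment_wraps = any(v + 1 >= MOD for v in pre_values)
```

The pre-increment sequence never wraps, but the increment executed in the final iteration does, so copying the no-wrap flag from the pre-increment recurrence onto the increment would be wrong.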
Sanjay Patel	d2c64e2df9	Fix really obscure bug in CannotBeNegativeZero() (PR22688) With a diabolically crafted test case, we could recurse through this code and return true instead of false. The larger engineering crime is the use of magic numbers. Added FIXME comments for those. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230515 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-25 18:00:15 +00:00
Charles Davis	fba7e30f0f	[IC] Turn non-null MD on pointer loads to range MD on integer loads. Summary: This change fixes the FIXME that you recently added when you committed (a modified version of) my patch. When `InstCombine` combines a load and store of a pointer to those of an equivalently-sized integer, it currently drops any `!nonnull` metadata that might be present. This change replaces `!nonnull` metadata with `!range !{ 1, -1 }` metadata instead. Reviewers: chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7621 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230462 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-25 05:10:25 +00:00
Peter Collingbourne	0bf03cb473	LowerBitSets: Introduce global layout builder. The builder is based on a layout algorithm that tries to keep members of small bit sets together. The new layout compresses Chromium's bit sets to around 15% of their original size. Differential Revision: http://reviews.llvm.org/D7796 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230394 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-24 23:17:02 +00:00
Hans Wennborg	b499b73e30	Revert r230280: "Bugfix: SCEVExpander incorrectly marks increment operations as no-wrap" This caused PR22674, failing this assert: Instructions.h:2281: llvm::Value* llvm::PHINode::getOperand(unsigned int) const: Assertion `i_nocapture < OperandTraits<PHINode>::operands(this) && "getOperand() out of range!"' failed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230341 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-24 16:19:29 +00:00
Sanjoy Das	f922d9cfe4	New instcombine rule: max(~a,~b) -> ~min(a, b) This case is interesting because ScalarEvolutionExpander lowers min(a, b) as ~max(~a,~b). I think the profitability heuristics can be made more clever/aggressive, but this is a start. Differential Revision: http://reviews.llvm.org/D7821 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230285 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-24 00:08:41 +00:00
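The identity behind f922d9cfe4 follows from bitwise NOT being order-reversing in two's complement (~x equals -x - 1). A brute-force check over a small range makes this concrete; the 8-bit signed range is an assumption chosen to keep the check fast, and `identity_holds` is a hypothetical helper.

```python
def identity_holds(lo=-128, hi=127):
    """Check max(~a, ~b) == ~min(a, b) for every pair of integers in [lo, hi]."""
    values = range(lo, hi + 1)
    return all(max(~a, ~b) == ~min(a, b) for a in values for b in values)


# Example with concrete numbers: ~3 == -4 and ~7 == -8.
example_lhs = max(~3, ~7)
example_rhs = ~min(3, 7)
```

Since NOT flips the ordering, the larger of the two complements is always the complement of the smaller original value, which is what allows the rewrite.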
Sanjoy Das	8d16a81c33	Bugfix: SCEVExpander incorrectly marks increment operations as no-wrap When emitting the increment operation, SCEVExpander marks the operation as nuw or nsw based on the flags on the preincrement SCEV. This is incorrect because, for instance, it is possible that {-6,+,1} is <nuw> while {-6,+,1}+1 = {-5,+,1} is not. This change teaches SCEV to mark the increment as nuw/nsw only if it can explicitly prove that the increment operation won't overflow. Apart from the attached test case, another (more realistic) manifestation of the bug can be seen in Transforms/IndVarSimplify/pr20680.ll. NOTE: this change was landed with an incorrect commit message in rL230275 and was reverted for that reason in rL230279. This commit message is the correct one. Differential Revision: http://reviews.llvm.org/D7778 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230280 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-23 23:22:58 +00:00
Sanjoy Das	69048edf8a	Revert 230275. 230275 got committed with an incorrect commit message due to a mixup on my side. Will re-land in a few moments with the correct commit message. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230279 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-23 23:13:22 +00:00
Sanjoy Das	7ebbc8de2f	Fix bug 22641 The bug was a result of getPreStartForExtend interpreting nsw/nuw flags on an add recurrence more strongly than is legal. {S,+,X}<nsw> implies S+X is nsw only if the backedge of the loop is taken at least once. Differential Revision: http://reviews.llvm.org/D7808 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230275 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-23 22:55:13 +00:00
Chad Rosier	6229219f7e	Prevent hoisting fmul from THEN/ELSE to IF if there is fmsub/fmadd opportunity. This patch adds the isProfitableToHoist API. For AArch64, we want to prevent a fmul from being hoisted in cases where it is more profitable to form a fmsub/fmadd. Phabricator Review: http://reviews.llvm.org/D7299 Patch by Lawrence Hu <lawrence@codeaurora.org> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230241 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-23 19:15:16 +00:00
Mehdi Amini	2bd4b63e7a	InstSimplify: simplify 0 / X if nnan and nsz From: Fiona Glaser <fglaser@apple.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230238 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-23 18:30:25 +00:00
Sanjoy Das	c5e1132ac2	IRCE: use SCEVs instead of llvm::Value's for intermediate calculations. Semantically non-functional change. This gets rid of some of the SCEV -> Value -> SCEV round tripping and the Construct(SMin\|SMax)Of and MaybeSimplify helper routines. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230150 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-21 22:07:32 +00:00
Philip Reames	7a62a2a5ae	[PlaceSafepoints] Adjust enablement logic to default to off and be GC configurable per GC Previously, this pass ran over every function in the Module if added to the pass order. With this change, it runs only over those with a GC attribute where the GC explicitly opts in. A GC can also choose which of entry safepoint polls, backedge safepoint polls, and call safepoints it wants. I hope to get these exposed as checks on the GCStrategy at some point, but for now, the checks are manual string comparisons. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230097 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-21 00:09:09 +00:00
Benjamin Kramer	d889ad2ab8	LoopRotate: When reconstructing loop simplify form don't split edges from indirectbrs. Yet another chapter in the endless story. While this looks like we leave the loop in a non-canonical state this replicates the logic in LoopSimplify so it doesn't diverge from the canonical form in any way. PR21968 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230058 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-20 20:49:25 +00:00
Peter Collingbourne	5a81e14385	Introduce bitset metadata format and bitset lowering pass. This patch introduces a new mechanism that allows IR modules to co-operatively build pointer sets corresponding to addresses within a given set of globals. One particular use case for this is to allow a C++ program to efficiently verify (at each call site) that a vtable pointer is in the set of valid vtable pointers for the class or its derived classes. One way of doing this is for a toolchain component to build, for each class, a bit set that maps to the memory region allocated for the vtables, such that each 1 bit in the bit set maps to a valid vtable for that class, and lay out the vtables next to each other, to minimize the total size of the bit sets. The patch introduces a metadata format for representing pointer sets, an '@llvm.bitset.test' intrinsic and an LTO lowering pass that lays out the globals and builds the bitsets, and documents the new feature. Differential Revision: http://reviews.llvm.org/D7288 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230054 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-20 20:30:47 +00:00
Philip Reames	ef6e26ea1f	Bugfix for 229954 Before calling Function::getGC to test for enablement, we need to make sure there's actually a GC at all via Function::hasGC. Otherwise, we'd crash on functions without a GC. Thankfully, this only mattered if you manually scheduled the pass, but still, oops. :( git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230040 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-20 18:56:14 +00:00
Hal Finkel	5ecf528fc2	[InstCombine] Remove unnecessary variable indexing into single-element arrays This change addresses a deficiency pointed out in PR22629. To copy from the bug report: [from the bug report] Consider this code: int f(int x) { int a[] = {12}; return a[x]; } GCC knows to optimize this to movl $12, %eax ret The code generated by recent Clang at -O3 is: movslq %edi, %rax movl .L_ZZ1fiE1a(,%rax,4), %eax retq .L_ZZ1fiE1a: .long 12 # 0xc [end from the bug report] This definitely seems worth fixing. I've also seen this kind of code before (as the base case of generic vector wrapper templates with one element). The general idea is to look at the GEP feeding a load or a store, which has some variable as its first non-zero index, and determine if that index must be zero (or else an out-of-bounds access would occur). We can do this for allocas and globals with constant initializers where we know the maximum size of the underlying object. When we find such a GEP, we create a new one for the memory access with that first variable index replaced with a constant zero. Even if we can't eliminate the memory access (and sometimes we can't), it is still useful because it removes unnecessary indexing calculations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229959 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-20 03:05:53 +00:00
Philip Reames	e807289468	Adjust enablement of RewriteStatepointsForGC When back merging the changes in 229945 I noticed that I forgot to mark the test cases with the appropriate GC. We want the rewriting to be off by default (even when manually added to the pass order), not on by default. To keep the current test working, mark them as using the statepoint-example GC and whitelist that GC. Longer term, we need a better selection mechanism here for both actual usage and testing. As I migrate more tests to the in tree version of this pass, I will probably need to update the enable/disable logic as well. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229954 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-20 02:34:49 +00:00
Philip Reames	673db11fdb	Add a pass for constructing gc.statepoint sequences w/explicit relocations This patch consists of a single pass whose only purpose is to visit previously inserted gc.statepoints which do not have gc.relocates inserted yet, and insert them. This can be used either immediately after IR generation to perform 'early safepoint insertion' or late in the pass order to perform 'late insertion'. This patch is setting the stage for work to continue in tree. In particular, there are known naming and style violations in the current patch. I'll try to get those resolved over the next week or so. As I touch each area to make style changes, I need to make sure we have adequate testing in place. As part of the cleanup, I will be cleaning up a collection of test cases we have out of tree and submitting them upstream. The tests included in this change are very basic and mostly to provide examples of usage. The pass has several main subproblems it needs to address: - First, it has to identify any live pointers. In the current code, the use of address spaces to distinguish pointers to GC managed objects is hard coded, but this will become parametrizable in the near future. Note that the current change doesn't actually contain a useful liveness analysis. It was separated into a followup change as the code wasn't ready to be shared. Instead, the current implementation just considers any dominating def of appropriate pointer type to be live. - Second, it has to identify base pointers for each live pointer. This is a fairly straightforward data flow algorithm. - Third, the information in the previous steps is used to actually introduce rewrites. Rather than trying to do this by hand, we simply re-purpose the code behind Mem2Reg to do this for us. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229945 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-20 01:06:44 +00:00
Michael Gottesman	391935a017	[objc-arc-contract] We can not move retains over instructions which can not conservatively be proven to not decrement the retain's RCIdentity. I also cleaned up the code to make it more understandable for mere mortals. <rdar://problem/19853758> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229937 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-20 00:02:49 +00:00
Ahmed Bougacha	5898fc70ec	[ARM] Re-re-apply VLD1/VST1 base-update combine. This re-applies r223862, r224198, r224203, and r224754, which were reverted in r228129 because they exposed Clang misalignment problems when self-hosting. The combine caused the crashes because we turned ISD::LOAD/STORE nodes to ARMISD::VLD1/VST1_UPD nodes. When selecting addressing modes, we were very lax for the former, and only emitted the alignment operand (as in "[r1:128]") when it was larger than the standard alignment of the memory type. However, for ARMISD nodes, we just used the MMO alignment, no matter what. In our case, we turned ISD nodes to ARMISD nodes, and this caused the alignment operands to start being emitted. And that's how we exposed alignment problems that were ignored before (but I believe would have been caught with SCTRL.A==1?). To fix this, we can just mirror the hack done for ISD nodes: only take into account the MMO alignment when the access is overaligned. Original commit message: We used to only combine intrinsics, and turn them into VLD1_UPD/VST1_UPD when the base pointer is incremented after the load/store. We can do the same thing for generic load/stores. Note that we can only combine the first load/store+adds pair in a sequence (as might be generated for a v16f32 load for instance), because other combines turn the base pointer addition chain (each computing the address of the next load, from the address of the last load) into independent additions (common base pointer + this load's offset). rdar://19717869, rdar://14062261. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229932 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-19 23:52:41 +00:00
Rafael Espindola	1b4da6c8ce	Avoid conversion to float when creating ConstantDataArray/ConstantDataVector. Patch by Raoux, Thomas F! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229864 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-19 16:08:20 +00:00
Igor Laevsky	b05b70bab5	Add few simple tests to check statepoint placement for invoke instructions. Differential Revision: http://reviews.llvm.org/D7535 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229842 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-19 11:39:04 +00:00
Chandler Carruth	a8fb39af83	[x86,sdag] Two interrelated changes to the x86 and sdag code. First, don't combine bit masking into vector shuffles (even ones the target can handle) once operation legalization has taken place. Custom legalization of vector shuffles may exist for these patterns (making the predicate return true) but that custom legalization may in some cases produce the exact bit math this matches. We only really want to handle this prior to operation legalization. However, the x86 backend, in a fit of awesome, relied on this. What it would do is mark VSELECTs as expand, which would turn them into arithmetic, which this would then match back into vector shuffles, which we would then lower properly. Amazing. Instead, the second change is to teach the x86 backend to directly form vector shuffles from VSELECT nodes with constant conditions, and to mark all of the vector types we support lowering blends as shuffles as custom VSELECT lowering. We still mark the forms which actually support variable blends as legal so that the custom lowering is bypassed, and the legal lowering can even be used by the vector shuffle legalization (yes, I know, this is confusing, but that's how the patterns are written). This makes the VSELECT lowering much more sensible, and in fact should fix a bunch of bugs with it. However, as you'll see in the test cases, right now what it does is point out the hilarious deficiency of the new vector shuffle lowering when it comes to blends. Fortunately, my very next patch fixes that. I can't submit it yet, because that patch, somewhat obviously, forms the exact and/or pattern that the DAG combine is matching here! Without this patch, teaching the vector shuffle lowering to produce the right code infloops in the DAG combiner. With this patch alone, we produce terrible code but at least lower through the right paths. With both patches, all the regressions here should be fixed, and a bunch of the improvements (like using 2 shufps with no memory loads instead of 2 andps with memory loads and an orps) will stay. Win! There is one other change worth noting here. We had hilariously wrong vectorization cost estimates for vselect because we fell through to the code path that assumed all "expand" vector operations are scalarized. However, the "expand" lowering of VSELECT is vector bit math, most definitely not scalarized. So now we go back to the correct if horribly naive cost of "1" for "not scalarized". If anyone wants to add actual modeling of shuffle costs, that would be cool, but this seems an improvement on its own. Note the removal of 16 and 32 "costs" for doing a blend. Even in SSE2 we can blend in fewer than 16 instructions. ;] Of course, we don't right now because of OMG bad code, but I'm going to fix that. Next patch. I promise. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229835 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-19 10:36:19 +00:00
Sanjoy Das	6da5a456f4	Partial fix for bug 22589 Don't spend the entire iteration space in the scalar loop prologue if computing the trip count overflows. This change also gets rid of the backedge check in the prologue loop and the extra check for overflowing trip-count. Differential Revision: http://reviews.llvm.org/D7715 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229731 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-18 19:32:25 +00:00
Elena Demikhovsky	c0f91e1081	Minor fix after 229495. Removed metadata and function attributes from the test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229647 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-18 08:09:28 +00:00
Adam Nemet	47985fb7cd	[LoopAccesses] Modify test to also check symbolic strides with memchecks See the comment in the code. This is part of the patchset that converts LoopAccessAnalysis into an actual analysis pass. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229627 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-18 03:43:32 +00:00
Akira Hatanaka	5506e22865	[InstCombine] Do not insert a GEP instruction before a landingpad instruction. InstCombiner::visitGetElementPtrInst was using getFirstNonPHI to compute the insertion point, which caused the verifier to complain when a GEP was inserted before a landingpad instruction. This commit fixes it to use getFirstInsertionPt instead. rdar://problem/19394964 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229619 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-18 03:30:11 +00:00
Hal Finkel	8a85dee989	[BDCE] Don't forget uses of root instructions seen before the instruction itself When visiting the initial list of "root" instructions (those which must always be alive), for those that are integer-valued (such as invokes returning an integer), we mark their bits as (initially) all dead (we might, obviously, find uses of those bits later, but all bits are assumed dead until proven otherwise). Don't do so, however, if we've already seen a use of those bits by another root instruction (such as a store). Fixes a miscompile of the sanitizer unit tests on x86_64. Also, add a debug line for visiting the root instructions, and remove a debug line which tried to print instructions being removed (printing dead instructions is dangerous, and can sometimes crash). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229618 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-18 03:12:28 +00:00
Elena Demikhovsky	b70bdd9034	Fixed a bug in store sinking. The problem was in store-sink barrier check. Store sink barrier should be checked for ModRef (read-write) mode. http://llvm.org/bugs/show_bug.cgi?id=22613 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229495 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-17 13:10:05 +00:00
Hal Finkel	5b43c8551e	[BDCE] Add a bit-tracking DCE pass BDCE is a bit-tracking dead code elimination pass. It is based on ADCE (the "aggressive DCE" pass), with the added capability to track dead bits of integer valued instructions and remove those instructions when all of the bits are dead. Currently, it does not actually do this all-bits-dead removal, but rather replaces the instruction's uses with a constant zero, and lets instcombine (and the later run of ADCE) do the rest. Because we essentially get a run of ADCE "for free" while tracking the dead bits, we also do what ADCE does and removes actually-dead instructions as well (this includes instructions newly trivially dead because all bits were dead, but not all such instructions can be removed). The motivation for this is a case like: int __attribute__((const)) foo(int i); int bar(int x) { x |= (4 & foo(5)); x |= (8 & foo(3)); x |= (16 & foo(2)); x |= (32 & foo(1)); x |= (64 & foo(0)); x |= (128 & foo(4)); return x >> 4; } As it turns out, if you order the bit-field insertions so that all of the dead ones come last, then instcombine will remove them. However, if you pick some other order (such as the one above), the fact that some of the calls to foo() are useless is not locally obvious, and we don't remove them (without this pass). I did a quick compile-time overhead check using sqlite from the test suite (Release+Asserts). BDCE took ~0.4% of the compilation time (making it about twice as expensive as ADCE). I've not looked at why yet, but we eliminate instructions due to having all-dead bits in: External/SPEC/CFP2006/447.dealII/447.dealII External/SPEC/CINT2006/400.perlbench/400.perlbench External/SPEC/CINT2006/403.gcc/403.gcc MultiSource/Applications/ClamAV/clamscan MultiSource/Benchmarks/7zip/7zip-benchmark git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229462 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-17 01:36:59 +00:00
Mehdi Amini	e97c675022	InstCombine: fold more cases of (fp_to_u/sint (u/sint_to_fp val)) Fixes radar 15486701. From: Fiona Glaser <fglaser@apple.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229437 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-16 21:47:54 +00:00
Mehdi Amini	be55a79941	Tests: reformat sitofp.ll and use FileCheck From: Fiona Glaser <fglaser@apple.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229436 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-16 21:47:50 +00:00
James Molloy	2a7fbb1927	[LoopReroll] Relax some assumptions a little. We won't find a root with index zero in any loop that we are able to reroll. However, we may find one in a non-rerollable loop, so bail gracefully instead of failing hard. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229406 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-16 17:02:00 +00:00
James Molloy	4b739069e4	[LoopReroll] Don't crash on dead code If a PHI has no users, don't crash; bail gracefully. This shouldn't happen often, but we can make no guarantees that previous passes didn't leave dead code around. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229405 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-16 17:01:52 +00:00
David Majnemer	6de0a12927	IR: Properly return nullptr when getAggregateElement is out-of-bounds We didn't properly handle the out-of-bounds case for ConstantAggregateZero and UndefValue. This would manifest as a crash when the constant folder was asked to fold a load of a constant global whose struct type has no operands. This fixes PR22595. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229352 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-16 04:02:09 +00:00
David Blaikie	16035d6c0c	FileCheck-ize a test to make it easier to migrate to typeless pointers git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229278 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-15 04:14:00 +00:00
David Blaikie	51e38bc096	Update a test to make it easier to migrate to untyped pointers git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229277 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-15 04:13:58 +00:00
David Blaikie	95fa98330e	Update a test to use FileCheck so it's easier to migrate to future typeless pointer changes git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229276 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-15 04:13:57 +00:00
David Blaikie	7af26dce0d	Reformat test case to be easier to migrate to typeless pointers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229275 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-15 04:13:53 +00:00
Ramkumar Ramachandra	0608cec657	InstCombine: propagate deref via new addDereferenceableAttr The "dereferenceable" attribute cannot be added via .addAttribute(), since it also expects a size in bytes. AttrBuilder#addAttribute or AttributeSet#addAttribute is wrapped by classes Function, InvokeInst, and CallInst. Add corresponding wrappers to AttrBuilder#addDereferenceableAttr. Having done this, propagate the dereferenceable attribute via gc.relocate, adding a test to exercise it. Note that -datalayout is required during execution over and above -instcombine, because InstCombine only optionally requires DataLayoutPass. Differential Revision: http://reviews.llvm.org/D7510 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229265 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-14 19:37:54 +00:00
Philip Reames	d777c2c0c0	[InstCombine] When canonicalizing gep indices, prefer zext when possible If we know that the sign bit of a value being sign extended is zero, we can use a zero extension instead. This is motivated by the fact that zero extensions are generally cheaper on x86 (and most other architectures?). We already apply a similar transform in DAGCombine, this just extends that to the IR level. This comes up when we eagerly canonicalize gep indices to the width of a machine register (i64 on x86_64). To do so, we insert sign extensions (sext) to promote smaller types. Differential Revision: http://reviews.llvm.org/D7255 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229189 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-14 00:05:36 +00:00
Andrea Di Biagio	d25126faae	[InstCombine] Fix regression introduced at r227197. This patch fixes a problem I accidentally introduced in an instruction combine on select instructions added at r227197. That revision taught the instruction combiner how to fold a cttz/ctlz followed by a icmp plus select into a single cttz/ctlz with flag 'is_zero_undef' cleared. However, the new rule added at r227197 would have produced wrong results in the case where a cttz/ctlz with flag 'is_zero_undef' cleared was followed by a zero-extend or truncate. In that case, the folded instruction would have been inserted in a wrong location thus leaving the CFG in an inconsistent state. This patch fixes the problem and adds two reproducible test cases to existing test 'InstCombine/select-cmp-cttz-ctlz.ll'. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229124 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-13 16:33:34 +00:00
Andrea Di Biagio	59d115311a	[CodeGenPrepare] Removed duplicate logic. SimplifyCFG already knows how to speculate calls to cttz/ctlz. SimplifyCFG now knows how to speculate calls to intrinsic cttz/ctlz that are 'cheap' for the target. Therefore, some of the logic in CodeGenPrepare that was originally added at revision 224899 can now be removed. This patch is basically a no functional change. It removes the duplicated logic in CodeGenPrepare and converts all the existing target specific tests for cttz/ctlz into SimplifyCFG tests. Differential Revision: http://reviews.llvm.org/D7608 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229105 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-13 14:15:48 +00:00
James Molloy	acbbf932e9	[SimplifyCFG] Add test for r229099 Add extra test that was accidentally not staged. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229101 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-13 11:08:40 +00:00
Chandler Carruth	25fa343bd8	[unroll] Concede defeat and disable the unroll analyzer for now. The issues with the new unroll analyzer are more fundamental than code cleanup, algorithm, or data structure changes. I've sent an email to the original commit thread with details and a proposal for how to redesign things. I'm disabling this for now so that we don't spend time debugging issues with it in its current state. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229064 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-13 05:31:46 +00:00
Michael Liao	4235574ce3	[InstCombine] Fix a bug when combining `icmp` from `ptrtoint` - First, there's a crash when we try to combine those pointers into `icmp` directly by creating a `bitcast`, which is invalid if the two pointers are from different address spaces. - It's not always appropriate to cast one pointer to another if they are from different address spaces, as that is not a no-op cast. Instead, we only combine `icmp` from `ptrtoint` if the two pointers are of the same address space. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229063 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-13 04:51:26 +00:00
Chandler Carruth	a768c4a096	[IC] Fix a bug with the instcombine canonicalizing of loads and propagating of metadata. We were propagating !nonnull metadata even when the newly formed load is no longer of a pointer type. This is clearly broken and results in LLVM failing the verifier and aborting. This patch just restricts the propagation of !nonnull metadata to when we actually have a pointer type. This bug report and the initial version of this patch was provided by Charles Davis! Many thanks for finding this! We still need to add logic to round-trip the metadata correctly if we combine from pointer types to integer types and then back by using range metadata for the integer type loads. But this is the minimal and safe version of the patch, which is important so we can backport it into 3.6. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229029 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-13 02:30:01 +00:00
Olivier Sallenave	5f235de01b	Check interleaving without relying on debug output. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229027 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-13 02:13:57 +00:00
Michael Zolotukhin	cd35fcecc2	Testcase for r228988. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228995 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-13 00:35:45 +00:00
NAKAMURA Takumi	37fc1833a8	llvm/test/Transforms/LoopVectorize/PowerPC/small-loop-rdx.ll REQUIRES +Asserts due to -debug. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228989 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-13 00:21:34 +00:00
Olivier Sallenave	90e069dc29	Change max interleave factor to 12 for POWER7 and POWER8. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228973 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-12 22:57:58 +00:00
Bjorn Steinbrink	53a7b568b2	Fix a crash in the assumption cache when inlining indirect function calls Summary: Instances of the AssumptionCache are per function, so we can't re-use the same AssumptionCache instance when recursing in the CallAnalyzer to analyze a different function. Instead we have to pass the AssumptionCacheTracker to the CallAnalyzer so it can get the right AssumptionCache on demand. Reviewers: hfinkel Subscribers: llvm-commits, hans Differential Revision: http://reviews.llvm.org/D7533 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228957 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-12 21:04:22 +00:00
Benjamin Kramer	82f9916923	Update test case. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228956 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-12 20:40:19 +00:00
Benjamin Kramer	d038a7fe67	InstCombine: Allow folding of xor into icmp by changing the predicate for vectors The loop vectorizer can create this pattern. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228954 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-12 20:26:46 +00:00
Michael Zolotukhin	5513166316	Add a testcase for r228432. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228951 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-12 19:57:24 +00:00
James Molloy	28a123abaf	[LoopRerolling] Be more forgiving with instruction order. We can't solve the full subgraph isomorphism problem. But we can allow obvious cases, where for example two instructions of different types are out of order. Due to them having different types/opcodes, there is no ambiguity. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228931 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-12 15:54:14 +00:00
Andrea Di Biagio	44926033f6	[TTI] Teach the cost heuristic how to query TLI to check if a zext/trunc is 'free' for the target. Now that SimplifyCFG uses TTI for the cost heuristic, we can teach BasicTTIImpl how to query TLI in order to get a more accurate cost for truncates and zero-extends. Before this patch, the basic cost heuristic in TargetTransformInfoImplCRTPBase would have conservatively returned a 'default' TCC_Basic for all zero-extends, and TCC_Free for truncates on native types. This patch improves the heuristic so that we query TLI (if available) to get more accurate answers. If TLI is available, then methods 'isZExtFree' and 'isTruncateFree' can be used to check if a zext/trunc is free for the target. Added more test cases to SimplifyCFG/X86/speculate-cttz-ctlz.ll. With this change, SimplifyCFG is now able to speculate a 'cheap' cttz/ctlz immediately followed by a free zext/trunc. Differential Revision: http://reviews.llvm.org/D7585 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228923 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-12 14:17:24 +00:00
Chandler Carruth	ffc97a6ae1	[slp] Fix a nasty bug in the SLP vectorizer that Joerg pointed out. Apparently some code finally started to tickle this after my canonicalization changes to instcombine. The bug stems from trying to form a vector type out of scalars that aren't compatible at all. In this example, from x86_mmx values. The code in the vectorizer that checks for reasonable types was checking for aggregates or vectors, but there are lots of other types that should just never reach the vectorizer. Debugging this was made more confusing by the lie in an assert in VectorType::get() -- it isn't that the types are primitive. The types must be integer, pointer, or floating point types. No other types are allowed. I've improved the assert and added a helper to the vectorizer to handle the element type validity checks. It now re-uses the VectorType static function and then further excludes weird target-specific types that we probably shouldn't be touching here (x86_fp80 and ppc_fp128). Neither of these are really reachable anyways (neither 80-bit nor 128-bit things will get vectorized) but it seems better to just eagerly exclude such nonsense. I've added a test case, but while it definitely covers two of the paths through this code there may be more paths that would benefit from test coverage. I'm not familiar enough with the SLP vectorizer to synthesize test cases for all of these, but was able to update the code itself by inspection. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228899 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-12 02:30:56 +00:00
Tim Northover	7f75841a73	DeadArgElim: aggregate Return assessment properly. I mistakenly thought the liveness of each "RetVal(F, i)" depended only on F. It actually depends on the index too, which means we need to be careful about how the results are combined before return. In particular if a single Use returns Live, that counts for the entire object, at the granularity we're considering. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228885 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-11 23:13:11 +00:00
Mehdi Amini	23af697ae6	Reassociate: cannot negate a INT_MIN value Summary: When trying to canonicalize negative constants out of multiplication expressions, we need to check that the constant is not INT_MIN which cannot be negated. Reviewers: mcrosier Reviewed By: mcrosier Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7286 From: Mehdi Amini <mehdi.amini@apple.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228872 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-11 19:54:44 +00:00
Andrea Di Biagio	f033db57e9	[TTI] Improved cost heuristic for cttz/ctlz calls. This patch is a follow-up of r228826 (see code-review: D7506). Now that SimplifyCFG uses TargetTransformInfo for cost analysis, we have to fix the cost heuristic for intrinsic calls to cttz/ctlz. This patch defines method 'getIntrinsicCost' in BasicTTIImpl: now, BasicTTIImpl queries TLI to check if a call to cttz/ctlz is cheap for the target. Added test cases in Transforms/SimplifyCFG/X86 to verify that on x86, SimplifyCFG only speculates a call to cttz/ctlz if it is cheap. Differential Revision: http://reviews.llvm.org/D7554 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228829 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-11 14:22:18 +00:00
James Molloy	4de471dd0a	[SimplifyCFG] Swap to using TargetTransformInfo for cost analysis. We're already using TTI in SimplifyCFG, so remove the hard-baked "cheapness" heuristic and use TTI directly. Generally NFC intended, but we're using a slightly different heuristic now so there is a slight test churn. Test changes: * combine-comparisons-by-cse.ll: Removed unneeded branch check. * 2014-08-04-muls-it.ll: Test now doesn't branch but emits muleq. * coalesce-subregs.ll: Superfluous block check. * 2008-01-02-hoist-fp-add.ll: fadd is safe to speculate. Change to udiv. * PhiBlockMerge.ll: Superfluous CFG checking code. Main checks still present. * select-gep.ll: A variable GEP is not expensive, just TCC_Basic, according to the TTI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228826 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-11 12:15:41 +00:00
James Molloy	caba7561ae	[LoopReroll] Introduce the concept of DAGRootSets. A DAGRootSet models an induction variable being used in a rerollable loop. For example: x[i*3+0] = y1 x[i*3+1] = y2 x[i*3+2] = y3 Base instruction -> i*3 +---+----+ / | \ ST[y1] +1 +2 <-- Roots | | ST[y2] ST[y3] There may be multiple DAGRootSets, for example: x[i*2+0] = ... (1) x[i*2+1] = ... (1) x[i*2+4] = ... (2) x[i*2+5] = ... (2) x[(i+1234)*2+5678] = ... (3) x[(i+1234)*2+5679] = ... (3) This concept is similar to the "Scale" member used previously, but allows multiple independent sets of roots based off the same induction variable. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228821 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-11 09:19:47 +00:00
Reid Kleckner	7c5e0c9851	Fix invalid LLVM IR in PruneEH tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228786 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-11 02:06:47 +00:00
Reid Kleckner	690248bf52	Don't promote asynch EH invokes of nounwind functions to calls If the landingpad of the invoke is using a personality function that catches asynch exceptions, then it can catch a trap. Also add some landingpads to invalid LLVM IR test cases that lack them. Over-the-shoulder reviewed by David Majnemer. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228782 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-11 01:23:16 +00:00
David Majnemer	78d0638594	EarlyCSE: Add check lines for test added in r228760 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228761 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-10 23:11:02 +00:00
David Majnemer	0f8bd667a1	EarlyCSE: It isn't safe to CSE across synchronization boundaries This fixes PR22514. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228760 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-10 23:09:43 +00:00
Tim Northover	b613f8842e	DeadArgElim: arguments affect all returned sub-values by default. Unless we meet an insertvalue on a path from some value to a return, that value will be live if any of the return's components are live, so all of those components must be added to the MaybeLiveUses. Previously we were deleting arguments if sub-value 0 turned out to be dead. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228731 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-10 19:49:18 +00:00
Michael Zolotukhin	261a3a361b	Add a test case for new unrolling heuristics. The heuristics were added in r228265 and r228434. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228713 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-10 17:54:54 +00:00
Chandler Carruth	3e77df419d	Revert r228556: InstCombine: propagate nonNull through assume This commit isn't using the correct context, and is transforming calls that are operands to loads rather than calls that are operands to an icmp feeding into an assume. I've replied on the original review thread with a very reduced test case and some thoughts on how to rework this. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228677 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-10 08:07:32 +00:00
Ramkumar Ramachandra	69a5c89128	PlaceSafepoints: modernize gc.result.* -> gc.result Differential Revision: http://reviews.llvm.org/D7516 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228625 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-09 23:00:40 +00:00
Philip Reames	d3f3d5f0d7	Introduce more tests for PlaceSafepoints These test the two optimizations for backedge insertion currently implemented and the split backedge flag which is currently off by default. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228617 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-09 22:10:15 +00:00
Philip Reames	2eace6ebc5	Minor test cleanup a) add gc attribute b) remove unused param git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228612 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-09 21:50:31 +00:00
Philip Reames	c016fa420f	Add basic tests for PlaceSafepoints This is just adding really simple tests which should have been part of the original submission. When doing so, I discovered that I'd mistakenly removed required pieces when preparing the patch for upstream submission. I fixed two such bugs in this submission. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228610 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-09 21:48:05 +00:00
Tim Northover	968ed6a5f0	DeadArgElim: fix mismatch in accounting of array return types. Some parts of DeadArgElim were only considering the individual fields of StructTypes separately, but others (where insertvalue & extractvalue instructions occur) also looked into ArrayTypes. This one is an actual bug; the mismatch can lead to an argument being considered used by a return sub-value that isn't being tracked (and hence is dead by default). It then gets incorrectly eliminated. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228559 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-09 01:21:00 +00:00
Tim Northover	c4af8c9467	DeadArgElim: assess uses of entire return value aggregate. Previously, a non-extractvalue use of an aggregate return value meant the entire return was considered live (the algorithm gave up entirely). This was correct, but conservative. It's better to actually look at that Use, making the analysis results apply to all sub-values under consideration. E.g. %val = call { i32, i32 } @whatever() [...] ret { i32, i32 } %val The return is using the entire aggregate (sub-values 0 and 1). We can still simplify @whatever if we can prove that this return is itself unused. Also unifies the logic slightly between aggregate and non-aggregate cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228558 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-09 01:20:53 +00:00
Ramkumar Ramachandra	5439a80dca	InstCombine: propagate nonNull through assume Make assume (load (call|invoke) != null) set nonNull return attribute for the call and invoke. Also include tests. Differential Revision: http://reviews.llvm.org/D7107 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228556 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-09 01:13:13 +00:00
Bjorn Steinbrink	61a16d2a16	Correctly combine alias.scope metadata by a union instead of intersecting Summary: The alias.scope metadata represents sets of things an instruction might alias with. When generically combining the metadata from two instructions the result must be the union of the original sets, because the new instruction might alias with anything any of the original instructions aliased with. Reviewers: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7490 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228525 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-08 17:07:14 +00:00
Benjamin Kramer	a54b82a9fe	ValueTracking: Make isBytewiseValue simpler and more powerful at the same time. Turns out there is a simpler way of checking that all bytes in a word are equal than binary decomposition. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228503 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-07 19:29:02 +00:00
Bjorn Steinbrink	2dd5f23a1d	Properly update AA metadata when performing call slot optimization Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7482 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228500 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-07 17:54:36 +00:00
Matthias Braun	2f2dec87fb	InstCombine: Combine select sequences into a single select Normalize select(C0, select(C1, a, b), b) -> select((C0 & C1), a, b) select(C0, a, select(C1, a, b)) -> select((C0 | C1), a, b) This normal form may enable further combines on the And/Or and shortens paths for the values. Many targets prefer the other but can go back easily in CodeGen. Differential Revision: http://reviews.llvm.org/D7399 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228409 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-06 17:49:36 +00:00
Michael Kuperstein	acd7b00be2	Teach isDereferenceablePointer() to look through bitcast constant expressions. This fixes a LICM regression due to the new load+store pair canonicalization. Differential Revision: http://reviews.llvm.org/D7411 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228284 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-05 09:15:37 +00:00
Cameron Esfahani	d02540a1d7	Value soft float calls as more expensive in the inliner. Summary: When evaluating floating point instructions in the inliner, ask the TTI whether it is an expensive operation. By default, it's not an expensive operation. This keeps the default behavior the same as before. The ARM TTI has been updated to return back TCC_Expensive for targets which don't have hardware floating point. Reviewers: chandlerc, echristo Reviewed By: echristo Subscribers: t.p.northover, aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D6936 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228263 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-05 02:09:33 +00:00
Tom Stellard	7c038bc15f	StructurizeCFG: Use a reverse post-order traversal We were previously doing a post-order traversal and operating on the list in reverse, however this would occasionally cause backedges for loops to be visited before some of the other blocks in the loop. We now use a reverse post-order traversal, which avoids this issue. The reverse post-order traversal is not completely ideal, so we need to manually fix up the list to ensure that inner loop backedges are visited before outer loop backedges. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228186 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 20:49:44 +00:00
Renato Golin	ff01f89466	Reverting VLD1/VST1 base-updating/post-incrementing combining This reverts patches 223862, 224198, 224203, and 224754, which were all related to the vector load/store combining and were reverted/reapplied a few times due to the same alignment problems we're seeing now. Further tests, mainly self-hosting Clang, will be needed to reapply this patch in the future. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228129 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 10:11:59 +00:00
Daniel Berlin	403050abcc	Allow PRE to insert no-cost phi nodes git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228024 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 20:37:08 +00:00
Jingyue Wu	2918efd551	Add straight-line strength reduction to LLVM Summary: Straight-line strength reduction (SLSR) is implemented in GCC but not yet in LLVM. It has proven to effectively simplify statements derived from an unrolled loop, and can potentially benefit many other cases too. For example, LLVM unrolls #pragma unroll for (int i = 0; i < 3; ++i) { sum += foo((b + i) * s); } into sum += foo(b * s); sum += foo((b + 1) * s); sum += foo((b + 2) * s); However, no optimizations yet reduce the internal redundancy of the three expressions: b * s (b + 1) * s (b + 2) * s With SLSR, LLVM can optimize these three expressions into: t1 = b * s t2 = t1 + s t3 = t2 + s This commit is only an initial step towards implementing a series of such optimizations. I will implement more (see TODO in the file commentary) in the near future. This optimization is enabled for the NVPTX backend for now. However, I am more than happy to push it to the standard optimization pipeline after more thorough performance tests. Test Plan: test/StraightLineStrengthReduce/slsr.ll Reviewers: eliben, HaoLiu, meheff, hfinkel, jholewinski, atrick Reviewed By: jholewinski, atrick Subscribers: karthikthecool, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D7310 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228016 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 19:37:06 +00:00
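The rewrite described in the SLSR message, from `(b + i) * s` to a running sum that adds `s` each step, can be sketched outside the compiler as follows. The function names are hypothetical, and the code only shows that the two forms agree on the values they produce; it says nothing about how the pass finds candidates in IR.

```python
def products_naive(b: int, s: int, count: int) -> list:
    """Compute (b + i) * s for i in 0..count-1 with one multiply each."""
    return [(b + i) * s for i in range(count)]


def products_strength_reduced(b: int, s: int, count: int) -> list:
    """Same values using one multiply, then one add per further element.

    Mirrors t1 = b * s, t2 = t1 + s, t3 = t2 + s from the commit message.
    """
    results = []
    t = b * s
    for _ in range(count):
        results.append(t)
        t += s
    return results
```

For `b = 5`, `s = 7` and three elements, both functions produce 35, 42 and 49.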
Erik Eckstein	40d542097a	Fix: SLPVectorizer crashes with assertion when vectorizing a cmp instruction. The commit r225977 uncovered this bug. The problem was that the vectorizer tried to read the second operand of an already deleted instruction. The bug didn't show up before r225977 because the freed memory still contained a non-null pointer. With r225977 deletion of instructions is delayed and the read operand pointer is always null. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227800 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-02 12:45:34 +00:00
Chandler Carruth	9a941b2028	[PM] Port SimplifyCFG to the new pass manager. This should be sufficient to replace the initial (minor) function pass pipeline in Clang with the new pass manager. I'll probably add an (off by default) flag to do that just to ensure we can get extra testing. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227726 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-01 11:34:21 +00:00
Chandler Carruth	80c55f265d	[PM] Port EarlyCSE to the new pass manager. I've added RUN lines both to the basic test for EarlyCSE and the target-specific test, as this serves as a nice test that the TTI layer in the new pass manager is in fact working well. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227725 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-01 10:51:23 +00:00
Adrian Prantl	88deac4007	Inliner: Use replaceDbgDeclareForAlloca() instead of splicing the instruction and generalize it to optionally dereference the variable. Follow-up to r227544. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227604 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-30 19:37:48 +00:00
Hao Liu	2f45a3c252	Move the target specific test case arbitrary-induction-step.ll to test/Transforms/LoopVectorize/AArch64 folder. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227561 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-30 07:33:31 +00:00
Hao Liu	e7769db118	[LoopVectorize] Induction variables: support arbitrary constant step. Previously, only -1 and +1 step values are supported for induction variables. This patch extends LV to support arbitrary constant steps. Initial patch by Alexey Volkov. Some bug fixes are added in the following version. Differential Revision: http://reviews.llvm.org/D6051 and http://reviews.llvm.org/D7193 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227557 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-30 05:02:21 +00:00
Adrian Prantl	e413c8afe0	Fix PR22386. The inliner moves static allocas to the entry basic block so we need to move the dbg.declare intrinsics that describe them, too. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227544 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-30 01:55:25 +00:00
Sanjay Patel	26c81cc870	[GVN] don't propagate equality comparisons of FP zero (PR22376) In http://reviews.llvm.org/D6911, we allowed GVN to propagate FP equalities to allow some simple value range optimizations. But that introduced a bug when comparing to -0.0 or 0.0: these compare equal even though they are not bitwise identical. This patch disallows propagating zero constants in equality comparisons. Fixes: http://llvm.org/bugs/show_bug.cgi?id=22376 Differential Revision: http://reviews.llvm.org/D7257 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227491 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-29 20:51:49 +00:00
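The hazard in the GVN message comes from IEEE 754: `-0.0` and `0.0` compare equal but are distinct bit patterns, so replacing a value with the constant it compared equal to can change later results. A small sketch using Python floats, which are IEEE doubles on common platforms; the function names are made up for this illustration.

```python
import math
import struct


def bit_pattern(x: float) -> bytes:
    """Big-endian IEEE 754 double encoding of x."""
    return struct.pack(">d", x)


def sign_after_zero_check(x: float):
    """Use x itself inside the branch guarded by x == 0.0."""
    if x == 0.0:
        return math.copysign(1.0, x)
    return None


def sign_after_bad_propagation(x: float):
    """Same code after wrongly substituting the constant 0.0 for x."""
    if x == 0.0:
        return math.copysign(1.0, 0.0)
    return None
```

With `x = -0.0` the first function returns `-1.0` and the second returns `1.0`, which is the kind of divergence that refusing to propagate zero constants avoids.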
Philip Reames	61a76b2d4a	Teach SplitBlockPredecessors how to handle landingpad blocks. Patch by: Igor Laevsky <igor@azulsystems.com> "Currently SplitBlockPredecessors generates incorrect code in case if basic block we are going to split has a landingpad. Also seems like it is fairly common case among it's users to conditionally call either SplitBlockPredecessors or SplitLandingPadPredecessors. Because of this I think it is reasonable to add this condition directly into SplitBlockPredecessors." Differential Revision: http://reviews.llvm.org/D7157 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227390 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-28 23:06:47 +00:00
Michael Kuperstein	0906c8fc1c	[X86] Reduce some 32-bit imuls into lea + shl Reduce integer multiplication by a constant of the form k*2^c, where k is in {3,5,9} into a lea + shl. Previously it was only done for imulq on 64-bit platforms, but it makes sense for imull and 32-bit as well. Differential Revision: http://reviews.llvm.org/D7196 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227308 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-28 14:08:22 +00:00
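The arithmetic behind the lea + shl reduction can be sketched as below: a multiplier of the form k*2^c with k in {3, 5, 9} is split so that `x * k` is computed as `x + x * (k - 1)` (the shape an x86 `lea` with scale 2, 4 or 8 can express) and the result is shifted left by c. This models only the 32-bit wraparound arithmetic, not the backend's actual matching code, and `decompose` and `mul_via_lea_shl` are hypothetical names.

```python
def decompose(multiplier: int):
    """Return (k, c) with multiplier == k * 2**c and k in {3, 5, 9}, else None."""
    for k in (3, 5, 9):
        if multiplier > 0 and multiplier % k == 0:
            q = multiplier // k
            if (q & (q - 1)) == 0:  # q is a power of two
                return k, q.bit_length() - 1
    return None


def mul_via_lea_shl(x: int, multiplier: int, width: int = 32) -> int:
    """Multiply x by k * 2**c using an add-with-scale step and a shift."""
    parts = decompose(multiplier)
    if parts is None:
        raise ValueError("multiplier is not of the form k * 2**c")
    k, c = parts
    mask = (1 << width) - 1
    lea = (x + x * (k - 1)) & mask  # models lea (x, x, k-1)
    return (lea << c) & mask        # models shl by c
```

For example, 40 decomposes as 5 * 2^3, so `mul_via_lea_shl(7, 40)` yields 280, while 45 has no such decomposition.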
Elena Demikhovsky	b5c82c079a	Fold fcmp in cases where value is provably non-negative. By Arch Robison. This patch folds fcmp in some cases of interest in Julia. The patch adds a function CannotBeOrderedLessThanZero that returns true if a value is provably not less than zero. I.e. the function returns true if the value is provably -0, +0, positive, or a NaN. The patch extends InstructionSimplify.cpp to fold instances of fcmp where: - the predicate is olt or uge - the first operand is provably not less than zero - the second operand is zero The motivation for handling these cases is optimizing away domain checks for sqrt in Julia for common idioms such as sqrt(x*x+y*y). http://reviews.llvm.org/D6972 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227298 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-28 08:03:58 +00:00
Reid Kleckner	0935e7a79b	Move EH personality type classification to Analysis/LibCallSemantics.h Summary: Also add enum types for __C_specific_handler and _CxxFrameHandler3 for which we know a few things. Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7214 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227284 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-28 01:17:38 +00:00
Ahmed Bougacha	37be0d7c43	[SimplifyLibCalls] Don't confuse strcpy_chk for stpcpy_chk. This was introduced in a faulty refactoring (r225640, mea culpa): the tests weren't testing the return values, so, for both __strcpy_chk and __stpcpy_chk, we would return the end of the buffer (matching stpcpy) instead of the beginning (for strcpy). The root cause was the prefix "__" being ignored when comparing, which made us always pick LibFunc::stpcpy_chk. Pass the LibFunc::Func directly to avoid this kind of error. Also, make the testcases as explicit as possible to prevent this. The now-useful testcases expose another, entangled, stpcpy problem, with the further simplification. This was introduced in a refactoring (r225640) to match the original behavior. However, this leads to problems when successive simplifications generate several similar instructions, none of which are removed by the custom replaceAllUsesWith. For instance, InstCombine (the main user) doesn't erase the instruction in its custom RAUW. When trying to simplify say __stpcpy_chk: - first, an stpcpy is created (fortified simplifier), - second, a memcpy is created (normal simplifier), but the stpcpy call isn't removed. - third, InstCombine later revisits the instructions, and simplifies the first stpcpy to a memcpy. We now have two memcpys. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227250 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-27 21:52:16 +00:00
Sanjoy Das	fdefc694cd	Teach IRCE to look at branch weights when recognizing range checks Splitting a loop to make range checks redundant is profitable only if the range check "never" fails. Make this fact a part of recognizing a range check -- a branch is a range check only if it is expected to pass (via branch_weights metadata). Differential Revision: http://reviews.llvm.org/D7192 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227249 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-27 21:38:12 +00:00
Andrea Di Biagio	944d86558e	[InstCombine] Teach how to fold a select into a cttz/ctlz with the 'is_zero_undef' flag. This patch teaches the Instruction Combiner how to fold a cttz/ctlz followed by a icmp plus select into a single cttz/ctlz with flag 'is_zero_undef' cleared. Added test InstCombine/select-cmp-cttz-ctlz.ll. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227197 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-27 15:58:14 +00:00
David Majnemer	90c42ddc62	LoopRotate: Don't walk the uses of a Constant LoopRotate wanted to avoid live range interference by looking at the uses of a Value in the loop latch and seeing if any lay outside of the loop. We would wrongly perform this operation on Constants. This fixes PR22337. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227171 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-27 06:21:43 +00:00
Chad Rosier	13faabb6c5	Commoning of target specific load/store intrinsics in Early CSE. Phabricator revision: http://reviews.llvm.org/D7121 Patch by Sanjin Sijaric <ssijaric@codeaurora.org>! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227149 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-26 22:51:15 +00:00
Philip Reames	c1beae0d42	Add test cases for PRE w/volatile loads These tests check that the combination of 227110 (cross block query inst) and 227112 (volatile load semantics) work together properly to allow PRE in cases where a loop contains a volatile access. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227146 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-26 22:40:44 +00:00
Hans Wennborg	325385a37f	SimplifyCFG: Omit range checks for switch lookup tables when default is unreachable The range check would get optimized away later, but we might as well not emit them in the first place. http://reviews.llvm.org/D6471 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227126 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-26 19:52:34 +00:00
Hans Wennborg	d5c2318adc	SimplifyCFG: don't remove unreachable default switch destinations An unreachable default destination can be exploited by other optimizations and allows for more efficient lowering. Both the SDag switch lowering and LowerSwitch can exploit unreachable defaults. Also make TurnSwitchRangeICmp handle switches with unreachable default. This is kind of separate change, but it cannot be tested without the change above, and I don't want to land the change above without this since that would regress other tests. Differential Revision: http://reviews.llvm.org/D6471 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227125 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-26 19:52:32 +00:00
Philip Reames	cce3c83917	Refine memory dependence's notion of volatile semantics According to my reading of the LangRef, volatiles are only ordered with respect to other volatiles. It is entirely legal and profitable to forward unrelated loads over the volatile load. This patch implements this for GVN by refining the transition rules MemoryDependenceAnalysis uses when encountering a volatile. The added test cases show where the extra flexibility is profitable for local dependence optimizations. I have a related change (227110) which will extend this to non-local dependence (i.e. PRE), but that's essentially orthogonal to the semantic change in this patch. I have tested the two together and can confirm that PRE works over a volatile load with both changes. I will be submitting a PRE w/volatiles test case separately in the near future. Differential Revision: http://reviews.llvm.org/D6901 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227112 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-26 18:54:27 +00:00
Philip Reames	8a5ad05c13	Pass QueryInst down through non-local dependency calculation This change is mostly motivated by exposing information about the original query instruction to the actual scanning work in getPointerDependencyFrom when used by GVN PRE. In a follow up change, I will use this to be more precise with regards to the semantics of volatile instructions encountered in the scan of a basic block. Worth noting, is that this change (despite appearing quite simple) is not semantically preserving. By providing more information to the helper routine, we allow some optimizations to kick in that weren't previously able to (when called from this code path.) In particular, we see that treatment of !invariant.load becomes more precise. In theory, we might see a difference with an ordered/atomic instruction as well, but I'm having a hard time actually finding a test case which shows that. Test wise, I've included new tests for !invariant.load which illustrate this difference. I've also included some updated TBAA tests which highlight that this change isn't needed for that optimization to kick in - it's handled inside alias analysis itself. Eventually, it would be nice to factor the !invariant.load handling inside alias analysis as well. Differential Revision: http://reviews.llvm.org/D6895 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227110 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-26 18:39:52 +00:00
Erik Eckstein	8f6e8cb4f6	SLPVectorizer: fix wrong scheduling of atomic load/stores. This fixes PR22306. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227077 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-26 09:07:04 +00:00
Chandler Carruth	d4f6d111c1	[PM] Port LowerExpectIntrinsic to the new pass manager. This just lifts the logic into a static helper function, sinks the legacy pass to be a trivial wrapper of that helper function, and adds a trivial wrapper for the new PM as well. Not much to see here. I switched a test case to run in both modes, but we have to strip the dead prototypes separately as that pass isn't in the new pass manager (yet). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226999 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-24 11:13:02 +00:00
Chandler Carruth	7a98df7f74	[PM] Port instcombine to the new pass manager! This is exciting as this is a much more involved port. This is a complex, existing transformation pass. All of the core logic is shared between both old and new pass managers. Only the access to the analyses is separate because the actual techniques are separate. This also uses a bunch of different and interesting analyses and is the first time where we need to use an analysis across an IR layer. This also paves the way to expose instcombine utility functions. I've got a static function that implements the core pass logic over a function which might be mildly interesting, but more interesting is likely exposing a routine which just uses instructions already in the worklist and combines until empty. I've switched one of my favorite instcombine tests to run with both as well to make sure this keeps working. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226987 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-24 04:19:17 +00:00
Hans Wennborg	01e223e92e	LowerSwitch: replace unreachable default with popular case destination SimplifyCFG currently does this transformation, but I'm planning to remove that to allow other passes, such as this one, to exploit the unreachable default. This patch takes care to keep track of what case values are unreachable even after the transformation, allowing for more efficient lowering. Differential Revision: http://reviews.llvm.org/D6697 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226934 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-23 20:43:51 +00:00
Reid Kleckner	7ed0364cee	Revert "Don't remove a landing pad if the invoke requires a table entry." This reverts commit r176827. Björn Steinbrink pointed out that this didn't actually fix the bug (PR15555) it was attempting to fix. With this reverted, we can now remove landingpad cleanups that immediately resume unwinding, converting the invoke to a call. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226850 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-22 19:29:46 +00:00
Sanjoy Das	49afc9109e	Fix crashes in IRCE caused by mismatched types There are places where the inductive range check elimination pass depends on two llvm::Values or llvm::SCEVs to be of the same llvm::Type when they do not need to be. This patch relaxes those restrictions (by bailing out of the optimization if the types mismatch), and adds test cases to trigger those paths. These issues were found by bootstrapping clang with IRCE running in the -O3 pass ordering. Differential Revision: http://reviews.llvm.org/D7082 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226793 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-22 08:29:18 +00:00
Elena Demikhovsky	05e7ae1a7b	Fixed a bug in masked load/store in reversed loop. Added a test. The bug was submitted to bugzilla: http://llvm.org/bugs/show_bug.cgi?id=22225 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226791 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-22 08:20:06 +00:00
Chandler Carruth	b778cbc0c8	[canonicalize] Teach InstCombine to canonicalize loads which are only ever stored to always use a legal integer type if one is available. Regardless of whether this particular type is good or bad, it ensures we don't get weird differences in generated code (and resulting performance) from "equivalent" patterns that happen to end up using a slightly different type. After some discussion on llvmdev it seems everyone generally likes this canonicalization. However, there may be some parts of LLVM that handle it poorly and need to be fixed. I have at least verified that this doesn't impede GVN and instcombine's store-to-load forwarding powers in any obvious cases. Subtle cases are exactly what we need to flush out if they remain. Also note that this IR pattern should already be hitting LLVM from Clang at least because it is exactly the IR which would be produced if you used memcpy to copy a pointer or floating point between memory instead of a variable. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226781 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-22 05:08:12 +00:00
David Blaikie	f93662d3d5	DebugInfo: Use distinct inlinedAt MDLocations to avoid separate inlined calls being coalesced When two calls from the same MDLocation are inlined they currently get treated as one inlined function call (creating difficulty debugging, duplicate variables, etc). Clang worked around this by including column information on inline calls which doesn't address LTO inlining or calls to the same function from the same line and column (such as through a macro). It also didn't address ctor and member function calls. By making the inlinedAt locations distinct, every call site has an explicitly distinct location that cannot be coalesced with any other call. This can produce linearly (2x in the worst case where every call is inlined and the call instruction has a non-call instruction at the same location) more debug locations. Any increase beyond that is in cases where the Clang workaround was insufficient and the new scheme is creating necessary distinct nodes that were being erroneously coalesced previously. After this change to LLVM, the incomplete workarounds in Clang can be removed. That should reduce the number of debug locations (in a build without column info, the default on Darwin, not the default on Linux) by not creating pseudo-distinct locations for every call to an inline function. (oh, and I made the inlined-at chain rebuilding iterative instead of recursive because I was having trouble wrapping my head around it the way it was - open to discussion on the right design for that function (including going back to a recursive solution)) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226736 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-21 22:57:29 +00:00
David Majnemer	c070e4e528	InstCombine: Don't strip bitcasts off of callsites marked 'thunk' The return type of a thunk is meaningless, we just want the arguments and return value to be forwarded. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226708 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-21 22:32:04 +00:00
Alexander Potapenko	506c6ec22a	Use a smaller pragma unroll threshold to reduce test execution time. When opt is compiled with AddressSanitizer it takes more than 30 seconds to unroll the loop in unroll_1M(). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226660 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-21 13:52:02 +00:00
Karthik Bhat	7e9f120130	Fix operand reorder logic in SLPVectorizer to generate longer vectorizable chain. This patch fixes 2 issues in reorderInputsAccordingToOpcode 1) AllSameOpcodeLeft and AllSameOpcodeRight were being calculated incorrectly, resulting in code not being vectorized in a few cases. 2) Adds logic to reorder operands if we get a longer chain of consecutive loads enabling vectorization. Handled the same for cases where we have AltOpcode. Thanks Michael for inputs and review. Review: http://reviews.llvm.org/D6677 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226547 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-20 06:11:00 +00:00
Mehdi Amini	525f296ef1	Fix Reassociate handling of constant in presence of undef float http://reviews.llvm.org/D6993 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226245 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 03:00:58 +00:00
Sanjoy Das	148e8c9b8b	Add a new pass "inductive range check elimination" IRCE eliminates range checks of the form 0 <= A * I + B < Length by splitting a loop's iteration space into three segments in a way that the check is completely redundant in the middle segment. As an example, IRCE will convert len = < known positive > for (i = 0; i < n; i++) { if (0 <= i && i < len) { do_something(); } else { throw_out_of_bounds(); } } to len = < known positive > limit = smin(n, len) // no first segment for (i = 0; i < limit; i++) { if (0 <= i && i < len) { // this check is fully redundant do_something(); } else { throw_out_of_bounds(); } } for (i = limit; i < n; i++) { if (0 <= i && i < len) { do_something(); } else { throw_out_of_bounds(); } } IRCE can deal with multiple range checks in the same loop (it takes the intersection of the ranges that will make each of them redundant individually). Currently IRCE does not do any profitability analysis. That is a TODO. Please note that the status of this pass is experimental, and it is not part of any default pass pipeline. Having said that, I will love to get feedback and general input from people interested in trying this out. This pass was originally r226201. It was reverted because it used C++ features not supported by MSVC 2012. Differential Revision: http://reviews.llvm.org/D6693 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226238 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 01:03:22 +00:00
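The loop split in the IRCE message can be simulated to confirm that it preserves behaviour. In this sketch `throw_out_of_bounds()` is modelled as recording the failing index and stopping, which is an assumption about what the call does; `run_original` and `run_split` are hypothetical names, and `length` is taken to be positive as in the message.

```python
def run_original(n: int, length: int) -> list:
    """Single loop with a range check on every iteration."""
    trace = []
    for i in range(n):
        if 0 <= i < length:
            trace.append(("ok", i))
        else:
            trace.append(("oob", i))  # models throw_out_of_bounds()
            return trace
    return trace


def run_split(n: int, length: int) -> list:
    """Loop split at limit = smin(n, length); the first part needs no check."""
    trace = []
    limit = min(n, length)
    for i in range(limit):
        trace.append(("ok", i))  # 0 <= i < length holds by construction
    for i in range(limit, n):
        if 0 <= i < length:
            trace.append(("ok", i))
        else:
            trace.append(("oob", i))
            return trace
    return trace
```

Comparing the two traces over a grid of small `n` and positive `length` values gives identical results, including the case `n > length` where the check fails in the second loop.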
Sanjoy Das	df1b4f601d	Revert r226201 (Add a new pass "inductive range check elimination") The change used C++11 features not supported by MSVC 2012. I will fix the change to use things supported by MSVC 2012 and recommit shortly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226216 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 22:18:10 +00:00
Sanjoy Das	0170a308ec	Add a new pass "inductive range check elimination" IRCE eliminates range checks of the form 0 <= A * I + B < Length by splitting a loop's iteration space into three segments in a way that the check is completely redundant in the middle segment. As an example, IRCE will convert len = < known positive > for (i = 0; i < n; i++) { if (0 <= i && i < len) { do_something(); } else { throw_out_of_bounds(); } } to len = < known positive > limit = smin(n, len) // no first segment for (i = 0; i < limit; i++) { if (0 <= i && i < len) { // this check is fully redundant do_something(); } else { throw_out_of_bounds(); } } for (i = limit; i < n; i++) { if (0 <= i && i < len) { do_something(); } else { throw_out_of_bounds(); } } IRCE can deal with multiple range checks in the same loop (it takes the intersection of the ranges that will make each of them redundant individually). Currently IRCE does not do any profitability analysis. That is a TODO. Please note that the status of this pass is experimental, and it is not part of any default pass pipeline. Having said that, I will love to get feedback and general input from people interested in trying this out. Differential Revision: http://reviews.llvm.org/D6693 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226201 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 20:45:46 +00:00
Sanjoy Das	7ec1829823	Fix PR22222 The bug was introduced in r225282. r225282 assumed that sub X, Y is the same as add X, -Y. This is not correct if we are going to upgrade the sub to sub nuw. This change fixes the issue by making the optimization ignore sub instructions. Differential Revision: http://reviews.llvm.org/D6979 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226075 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 01:46:09 +00:00
Richard Smith	ef7d38d35a	For PR21145: recognise a builtin call to a known deallocation function even if it's defined in the current module. Clang generates this situation for the C++14 sized deallocation functions, because it generates a weak definition in case one isn't provided by the C++ runtime library. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226069 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 01:00:33 +00:00
Ramkumar Ramachandra	fba4d82671	[GC] CodeGenPrep transform: simplify offsetable relocate The transform is somewhat involved, but the basic idea is simple: find derived pointers that have been offset from the base pointer using gep and replace the relocate of the derived pointer with a gep to the relocated base pointer (with the same offset). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226060 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 23:27:07 +00:00
Duncan P. N. Exon Smith	37ac8d3622	IR: Move MDLocation into place This commit moves `MDLocation`, finishing off PR21433. There's an accompanying clang commit for frontend testcases. I'll attach the testcase upgrade script I used to PR21433 to help out-of-tree frontends/backends. This changes the schema for `DebugLoc` and `DILocation` from: !{i32 3, i32 7, !7, !8} to: !MDLocation(line: 3, column: 7, scope: !7, inlinedAt: !8) Note that empty fields (line/column: 0 and inlinedAt: null) don't get printed by the assembly writer. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226048 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 22:27:36 +00:00
David Majnemer	5e8cd99f55	InstCombine: Don't take A-B<0 into A<B if A-B has other uses This fixes PR22226. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226023 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 19:26:56 +00:00
Ahmed Bougacha	61d6dc41fa	[SimplifyLibCalls] Don't try to simplify indirect calls. It turns out, all callsites of the simplifier are guarded by a check for CallInst::getCalledFunction (i.e., to make sure the callee is direct). This check wasn't done when trying to further optimize a simplified fortified libcall, introduced by a refactoring in r225640. Fix that, add a testcase, and document the requirement. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225895 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 00:55:05 +00:00
Sanjay Patel	2211d38267	GVN: propagate equalities for floating point compares Allow optimizations based on FP comparison values in the same way as integers. This resolves PR17713: http://llvm.org/bugs/show_bug.cgi?id=17713 Differential Revision: http://reviews.llvm.org/D6911 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225660 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-12 19:29:48 +00:00
Hal Finkel	6829815d96	[PowerPC] Readjust the loop unrolling threshold Now that the way that the partial unrolling threshold for small loops is used to compute the unrolling factor has been corrected, a slightly smaller threshold is preferable. This is expected; other targets may need to re-tune as well. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225566 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-10 00:31:10 +00:00
Hal Finkel	a14d6f1ea5	[LoopUnroll] Fix the partial unrolling threshold for small loop sizes When we compute the size of a loop, we include the branch on the backedge and the comparison feeding the conditional branch. Under normal circumstances, these don't get replicated with the rest of the loop body when we unroll. This led to the somewhat surprising behavior that really small loops would not get unrolled enough -- they could be unrolled more and the resulting loop would be below the threshold, because we were assuming they'd take (LoopSize * UnrollingFactor) instructions after unrolling, instead of (((LoopSize-2) * UnrollingFactor)+2) instructions. This fixes that computation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225565 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-10 00:30:55 +00:00
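The effect of the corrected size estimate in the message above can be illustrated with a small calculation. The sketch searches for the largest unrolling factor whose estimated size stays within a threshold, under the old estimate `LoopSize * UnrollingFactor` and the new one `((LoopSize - 2) * UnrollingFactor) + 2`. The search loop, the `max_factor` cap and the threshold value used below are illustrative assumptions; the real unroller applies further constraints.

```python
def size_old(loop_size: int, factor: int) -> int:
    """Old estimate: every copy includes the backedge branch and compare."""
    return loop_size * factor


def size_new(loop_size: int, factor: int) -> int:
    """New estimate: the branch and compare (2 instructions) appear once."""
    return (loop_size - 2) * factor + 2


def largest_factor(loop_size: int, threshold: int, estimate,
                   max_factor: int = 64) -> int:
    """Largest factor (capped at max_factor) whose estimate fits the threshold."""
    factor = 1
    while factor < max_factor and estimate(loop_size, factor + 1) <= threshold:
        factor += 1
    return factor
```

For a 4-instruction loop and a hypothetical threshold of 40, the old estimate allows a factor of 10 while the new one allows 19, which matches the message's point that very small loops were previously unrolled less than the threshold permitted.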
Hans Wennborg	ca71be6415	SimplifyCFG: check uses of constant-foldable instrs in switch destinations (PR20210) The previous code assumed that such instructions could not have any uses outside CaseDest, with the motivation that the instruction could not dominate CommonDest because CommonDest has phi nodes in it. That simply isn't true; e.g., CommonDest could have an edge back to itself. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225552 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 22:13:31 +00:00
Tim Northover	8cd39a2630	Re-reapply r221924: "[GVN] Perform Scalar PRE on gep indices that feed loads before doing Load PRE" It's not really expected to stick around, last time it provoked a weird LTO build failure that I can't reproduce now, and the bot logs are long gone. I'll re-revert it if the failures recur. Original description: Perform Scalar PRE on gep indices that feed loads before doing Load PRE. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225536 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 19:19:56 +00:00
Hal Finkel	139bfee84c	[PowerPC] Enable late partial unrolling on the POWER7 The P7 benefits from not having really-small loops so that we either have multiple dispatch groups in the loop and/or the ability to form more-full dispatch groups during scheduling. Setting the partial unrolling threshold to 44 seems good, empirically, for the P7. Compared to using no late partial unrolling, this yields the following test-suite speedups: SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding -66.3253% +/- 24.1975% SingleSource/Benchmarks/Misc-C++/oopack_v1p8 -44.0169% +/- 29.4881% SingleSource/Benchmarks/Misc/pi -27.8351% +/- 12.2712% SingleSource/Benchmarks/Stanford/Bubblesort -30.9898% +/- 22.4647% I've speculatively added a similar setting for the P8. Also, I've noticed that the unroller does not quite calculate the unrolling factor correctly for really tiny loops because it neglects to account for the fact that not every loop body replicant contains an ending branch and counter increment. I'll fix that later. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225522 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 15:51:16 +00:00
Duncan P. N. Exon Smith	f416d72973	IR: Add 'distinct' MDNodes to bitcode and assembly Propagate whether `MDNode`s are 'distinct' through the other types of IR (assembly and bitcode). This adds the `distinct` keyword to assembly. Currently, no one actually calls `MDNode::getDistinct()`, so these nodes only get created for: - self-references, which are never uniqued, and - nodes whose operands are replaced that hit a uniquing collision. The concept of distinct nodes is still not quite first-class, since distinct-ness doesn't yet survive across `MapMetadata()`. Part of PR22111. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225474 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 22:38:29 +00:00
Matt Arsenault	3b1f741856	Fix fcmp + fabs instcombines when using the intrinsic This was only handling the libcall. This is another example of why only the intrinsic should ever be used when it exists. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225465 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 20:09:34 +00:00
Matt Arsenault	374b57cec9	Fix using wrong intrinsic in test This is a leftover from renaming the intrinsic. It's surprising the unknown llvm. intrinsic wasn't rejected. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225304 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 23:00:33 +00:00
Rafael Espindola	f907a26bc2	Change the .ll syntax for comdats and add a syntactic sugar. In order to make comdats always explicit in the IR, we decided to make the syntax a bit more compact for the case of a GlobalObject in a comdat with the same name. Just dropping the $name causes problems for @foo = global i32 0, comdat $bar = comdat ... and declare void @foo() comdat $bar = comdat ... So the syntax is changed to @g1 = global i32 0, comdat($c1) @g2 = global i32 0, comdat and declare void @foo() comdat($c1) declare void @foo() comdat git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225302 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 22:55:16 +00:00
Sanjoy Das	31123d4529	This patch teaches IndVarSimplify to add nuw and nsw to certain kinds of operations that provably don't overflow. For example, we can prove %civ.inc below does not sign-overflow. With this change, IndVarSimplify changes %civ.inc to an add nsw. define i32 @foo(i32* %array, i32* %length_ptr, i32 %init) { entry: %length = load i32* %length_ptr, !range !0 %len.sub.1 = sub i32 %length, 1 %upper = icmp slt i32 %init, %len.sub.1 br i1 %upper, label %loop, label %exit loop: %civ = phi i32 [ %init, %entry ], [ %civ.inc, %latch ] %civ.inc = add i32 %civ, 1 %cmp = icmp slt i32 %civ.inc, %length br i1 %cmp, label %latch, label %break latch: store i32 0, i32* %array %check = icmp slt i32 %civ.inc, %len.sub.1 br i1 %check, label %loop, label %break break: ret i32 %civ.inc exit: ret i32 42 } Differential Revision: http://reviews.llvm.org/D6748 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225282 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 19:02:56 +00:00
Matt Arsenault	d883ca0ca7	Convert fcmp with 0.0 from casted integers to icmp This is already handled in general when it is known the conversion can't lose bits with smaller integer types casted into wider floating point types. This pattern happens somewhat often in GPU programs that cast workitem intrinsics to float, which are often compared with 0. Specifically handle the special case of compares with zero which should also be known to not lose information. I had a more general version of this which allows equality compares if the casted float is exactly representable in the integer, but I'm not 100% confident that is always correct. Also fold cases that aren't integers to true / false. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225265 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 15:50:59 +00:00
David Majnemer	51e4a66417	InstCombine: Bitcast call arguments from/to pointer/integer type Try harder to get rid of bitcast'd calls by ptrtoint/inttoptr'ing arguments and return values when DataLayout says it is safe to do so. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225254 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 08:41:31 +00:00
Michael Kuperstein	25903ef9bc	Fix broken test from r225159. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225164 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 12:34:01 +00:00
Jiangning Liu	614fe873ce	Fixed a bug in memory dependence checking module of loop vectorization. The following loop should not be vectorized with current algorithm. {code} // loop body ... = a[i] (1) ... = a[i+1] (2) ....... a[i+1] = .... (3) a[i] = ... (4) {code} The algorithm tries to collect memory access candidates from AliasSetTracker, and then check memory dependences one another. The memory accesses are unique in AliasSetTracker, and a single memory access in AliasSetTracker may map to multiple entries in AccessAnalysis, which could cover both 'read' and 'write'. Originally the algorithm only checked 'write' entry in Accesses if only 'write' exists. This is incorrect and the consequence is it ignored all read access, and finally some RAW and WAR dependence are missed. For the case given above, if we ignore two reads, the dependence between (1) and (3) would not be able to be captured, and finally this loop will be incorrectly vectorized. The fix simply inserts a new loop to find all entries in Accesses. Since it will skip most of all other memory accesses by checking the Value pointer at the very beginning of the loop, it should not increase compile-time visibly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225159 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 10:08:58 +00:00
Chandler Carruth	4f9a7277d1	[SROA] Apply a somewhat heavy and unpleasant hammer to fix PR22093, an assert out of the new pre-splitting in SROA. This fix makes the code do what was originally intended -- when we have a store of a load both dealing in the same alloca, we force them to both be pre-split with identical offsets. This is really quite hard to do because we can keep discovering problems as we go along. We have to track every load over the current alloca which for any reason becomes invalid for pre-splitting, and go back to remove all stores of those loads. I've included a couple of test cases derived from PR22093 that cover the different ways this can happen. While that PR only really triggered the first of these two, it's the same fundamental issue. The other challenge here is documented in a FIXME now. We end up being quite a bit more aggressive for pre-splitting when loads and stores don't refer to the same alloca. This aggressiveness comes at the cost of introducing potentially redundant loads. It isn't clear that this is the right balance. It might be considerably better to require that we only do pre-splitting when we can presplit every load and store involved in the entire operation. That would give more consistent if conservative results. Unfortunately, it requires a non-trivial change to the actual pre-splitting operation in order to correctly handle cases where we end up pre-splitting stores out-of-order. And it isn't 100% clear that this is the right direction, although I'm starting to suspect that it is. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225149 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 04:17:53 +00:00
David Majnemer	07d7dbae9e	InstCombine: match can find ConstantExprs, don't assume we have a Value We assumed the output of a match was a Value, this would cause us to assert because we would fail a cast<>. Instead, use a helper in the Operator family to hide the distinction between Value and Constant. This fixes PR22087. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225127 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-04 07:36:02 +00:00
David Majnemer	77e22b7836	ValueTracking: ComputeNumSignBits should tolerate misshapen phi nodes PHI nodes can have zero operands in the middle of a transform. It is expected that utilities in Analysis don't freak out when this happens. Note that it is considered invalid to allow these misshapen phi nodes to make it to another pass. This fixes PR22086. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225126 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-04 07:06:53 +00:00
David Majnemer	5e9c6212a8	InstCombine: Detect when llvm.umul.with.overflow always overflows We know overflow always occurs if both ~LHSKnownZero * ~RHSKnownZero and LHSKnownOne * RHSKnownOne overflow. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225077 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-02 07:29:47 +00:00
Chandler Carruth	ce7f347da2	[SROA] Teach SROA to be more aggressive in splitting now that we have a pre-splitting pass over loads and stores. Historically, splitting could cause enough problems that I hamstrung the entire process with a requirement that splittable integer loads and stores must cover the entire alloca. All smaller loads and stores were unsplittable to prevent chaos from ensuing. With the new pre-splitting logic that does load/store pair splitting I introduced in r225061, we can now very nicely handle arbitrarily splittable loads and stores. In order to fully benefit from these smarts, we need to mark all of the integer loads and stores as splittable. However, we don't actually want to rewrite partitions with all integer loads and stores marked as splittable. This will fail to extract scalar integers from aggregates, which is kind of the point of SROA. =] In order to resolve this, what we really want to do is only do pre-splitting on the alloca slices with integer loads and stores fully splittable. This allows us to uncover all non-integer uses of the alloca that would benefit from a split in an integer load or store (and where introducing the split is safe because it is just memory transfer from a load to a store). Once done, we make all the non-whole-alloca integer loads and stores unsplittable just as they have historically been, repartition and rewrite. The result is that when there are integer loads and stores anywhere within an alloca (such as from a memcpy of a sub-object of a larger object), we can split them up if there are non-integer components to the aggregate hiding beneath. I've added the challenging test cases to demonstrate how this is able to promote to scalars even a case where we have even partially overlapping loads and stores. This restores the single-store behavior for small arrays of i8s which is really nice. I've restored both the little endian testing and big endian testing for these exactly as they were prior to r225061. 
It also forced me to be more aggressive in an alignment test to actually defeat SROA. =] Without the added volatiles there, we actually split up the weird i16 loads and produce nice double allocas with better alignment. This also uncovered a number of bugs where we failed to handle splittable load and store slices which didn't have a beginning offset of zero. Those fixes are included, and without them the existing test cases explode in glorious fireworks. =] I've kept support for leaving whole-alloca integer loads and stores as splittable even for the purpose of rewriting, but I think that's likely no longer needed. With the new pre-splitting, we might be able to remove all the splitting support for loads and stores from the rewriter. Not doing that in this patch to try to isolate any performance regressions that causes in an easy to find and revert chunk. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225074 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-02 03:55:54 +00:00
Chandler Carruth	40a8741994	[SROA] Add a test case for r225068 / PR22080. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225070 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-02 00:34:29 +00:00
Chandler Carruth	450b39e971	[SROA] Teach SROA how to much more intelligently handle split loads and stores. When there are accesses to an entire alloca with an integer load or store as well as accesses to small pieces of the alloca, SROA splits up the large integer accesses. In order to do that, it uses bit math to merge the small accesses into large integers. While this is effective, it produces insane IR that can cause significant problems in the rest of the optimizer: - It can cause load and store mismatches with GVN on the non-alloca side where we end up loading an i64 (or some such) rather than loading specific elements that are stored. - We can't always get rid of the integer bit math, which is why we can't always fix the loads and stores to work well with GVN. - This is especially bad when we have operations that mix poorly with integer bit math such as floating point operations. - It will block things like the vectorizer which might be able to handle the scalar stores that underly the aggregate. At the same time, we can't just directly split up these loads and stores in all cases. If there is actual integer arithmetic involved on the values, then using integer bit math is actually the perfect lowering because we can often combine it heavily with the surrounding math. The solution this patch provides is to find places where SROA is partitioning aggregates into small elements, and look for splittable loads and stores that it can split all the way to some other adjacent load and store. These are uniformly the cases where failing to split the loads and stores hurts the optimizer that I have seen, and I've looked extensively at the code produced both from more and less aggressive approaches to this problem. However, it is quite tricky to actually do this in SROA. We may have loads and stores to the same alloca, or other complex patterns that are hard to handle. This complexity leads to the somewhat subtle algorithm implemented here. 
We have to do this entire process as a separate pass over the partitioning of the alloca, and split up all of the loads prior to splitting the stores so that we can handle safely the cases of overlapping, including partially overlapping, loads and stores to the same alloca. We also have to reconstitute the post-split slice configuration so we can avoid iterating again over all the alloca uses (the slow part of SROA). But we also have to ensure that when we split up loads and stores to other allocas, we do re-iterate over them in SROA to adapt to the more refined partitioning now required. With this, I actually think we can fix a long-standing TODO in SROA where I avoided splitting as many loads and stores as probably should be splittable. This limitation historically mitigated the fallout of all the bad things mentioned above. Now that we have more intelligent handling, I plan to remove the FIXME and more aggressively mark integer loads and stores as splittable. I'll do that in a follow-up patch to help with bisecting any fallout. The net result of this change should be more fine-grained and accurate scalars being formed out of aggregates. At the very least, Clang now generates perfect code for this high-level test case using std::complex<float>: #include <complex> void g1(std::complex<float> &x, float a, float b) { x += std::complex<float>(a, b); } void g2(std::complex<float> &x, float a, float b) { x -= std::complex<float>(a, b); } void foo(const std::complex<float> &x, float a, float b, std::complex<float> &x1, std::complex<float> &x2) { std::complex<float> l1 = x; g1(l1, a, b); std::complex<float> l2 = x; g2(l2, a, b); x1 = l1; x2 = l2; } This code isn't just hypothetical either. It was reduced out of the hot inner loops of essentially every part of the Eigen math library when using std::complex<float>. 
Those loops would consistently and pervasively hop between the floating point unit and the integer unit due to bit math extraction and insertion of floating point values that were "stored" in a 64-bit integer register around the loop backedge. So far, this change has passed a bootstrap and I have done some other testing and so far, no issues. That doesn't mean there won't be though, so I'll be prepared to help with any fallout. If you see performance swings in particular, please let me know. I'm very curious what all the impact of this change will be. Stay tuned for the follow-up to also split more integer loads and stores. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225061 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-01 11:54:38 +00:00
Sanjay Patel	28650b8ec2	InstCombine: fsub nsz 0, X ==> fsub nsz -0.0, X Some day the backend may handle instruction-level fast math flags and make this transform unnecessary, but it's still better practice to use the canonical representation of fneg when possible (use a -0.0). This is a partial fix for PR20870 ( http://llvm.org/bugs/show_bug.cgi?id=20870 ). See also http://reviews.llvm.org/D6723. Differential Revision: http://reviews.llvm.org/D6731 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225050 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-31 22:14:05 +00:00
David Majnemer	0f77ccd6bb	InstCombine: try to transform A-B < 0 into A < B We are allowed to move the 'B' to the right hand side if we can prove there is no signed overflow and if the comparison itself is signed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225034 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-31 04:21:41 +00:00
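The no-signed-overflow condition in the message above can be checked exhaustively at a small bit width. This is a sketch on 8-bit values with hypothetical helper names, not the InstCombine implementation; it only shows that every case where `A-B < 0` and `A < B` disagree involves a wrapping subtraction.

```python
def to_signed8(x):
    # Interpret the low 8 bits of x as a two's-complement signed value.
    x &= 0xFF
    return x - 256 if x >= 128 else x


def sub_overflows(a, b):
    # True when the exact difference does not fit in a signed 8-bit value.
    return not (-128 <= a - b <= 127)


def mismatches():
    # All signed 8-bit pairs where the wrapped "A-B < 0" test
    # disagrees with the direct "A < B" comparison.
    return [
        (a, b)
        for a in range(-128, 128)
        for b in range(-128, 128)
        if (to_signed8(a - b) < 0) != (a < b)
    ]


if __name__ == "__main__":
    bad = mismatches()
    print(len(bad) > 0, all(sub_overflows(a, b) for a, b in bad))
```

Disagreements exist, such as A = -128 and B = 1, but each one comes from an overflowing subtraction, so the rewrite is safe once overflow has been ruled out.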
Philip Reames	91a083c57f	Carry facts about nullness and undef across GC relocation This change implements four basic optimizations: If a relocated value isn't used, it doesn't need to be relocated. If the value being relocated is null, relocation doesn't change that. (Technically, this might be collector specific. I don't know of one which it doesn't work for though.) If the value being relocated is undef, the relocation is meaningless. If the value being relocated was known nonnull, the relocated pointer also isn't null. (Since it points to the same source language object.) I outlined other planned work in comments. Differential Revision: http://reviews.llvm.org/D6600 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224968 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-29 23:27:30 +00:00
Philip Reames	1714ad67bd	Refine the notion of MayThrow in LICM to include a header specific version In LICM, we have a check for an instruction which is guaranteed to execute and thus can't introduce any new faults if moved to the preheader. To handle a function which might unconditionally throw when first called, we check for any potentially throwing call in the loop and give up. This is unfortunate when the potentially throwing condition is down a rare path. It prevents essentially all LICM of potentially faulting instructions where the faulting condition is checked outside the loop. It also greatly diminishes the utility of loop unswitching since control dependent instructions - which are now likely in the loop's header block - will not be lifted by subsequent LICM runs. define void @nothrow_header(i64 %x, i64 %y, i1 %cond) { ; CHECK-LABEL: nothrow_header ; CHECK-LABEL: entry ; CHECK: %div = udiv i64 %x, %y ; CHECK-LABEL: loop ; CHECK: call void @use(i64 %div) entry: br label %loop loop: ; preds = %entry, %for.inc %div = udiv i64 %x, %y br i1 %cond, label %loop-if, label %exit loop-if: call void @use(i64 %div) br label %loop exit: ret void } The current patch really only helps with non-memory instructions (i.e. divs, etc..) since the maythrow call down the rare path will be considered to alias an otherwise hoistable load. The one exception is that it does kick in for loads which are known to be invariant without regard to other possible stores, i.e. those marked with either !invariant.load metadata or tbaa 'is constant memory' metadata. Differential Revision: http://reviews.llvm.org/D6725 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224965 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-29 23:00:57 +00:00
Philip Reames	456b7b602c	Loading from null is valid outside of addrspace 0 This patch fixes a miscompile where we were assuming that loading from null is undefined and thus we could assume it doesn't happen. This transform is perfectly legal in address space 0, but is not necessarily legal in other address spaces. We really should introduce a hook to control this property on a per target per address space basis. We may be losing valuable optimizations in some address spaces by being too conservative. Original patch by Thomas P Raoux (submitted to llvm-commits), tests and formatting fixes by me. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224961 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-29 22:46:21 +00:00
David Majnemer	7627d9c229	InstCombine: Infer nuw for multiplies A multiply cannot unsigned wrap if there are bitwidth, or more, leading zero bits between the two operands. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224849 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-26 09:50:35 +00:00
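The leading-zero rule stated in the message above can be verified by brute force at a small width. This is a sketch on 8-bit unsigned values; `BITS`, `clz`, and `rule_holds` are illustrative names, not LLVM APIs.

```python
BITS = 8


def clz(x):
    # Count leading zero bits of x in a BITS-wide unsigned value.
    return BITS - x.bit_length()


def rule_holds():
    # Whenever the operands together have at least BITS leading zeros,
    # the exact product must fit in BITS bits (no unsigned wrap).
    for a in range(1 << BITS):
        for b in range(1 << BITS):
            if clz(a) + clz(b) >= BITS and a * b >= (1 << BITS):
                return False
    return True


if __name__ == "__main__":
    print(rule_holds())
```

The condition is sufficient but not necessary: 16 * 15 = 240 fits in 8 bits although the operands have only 3 + 4 = 7 leading zeros between them.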
David Majnemer	998ae69abe	InstCombine: Infer nsw for multiplies We already utilize this logic for reducing overflow intrinsics, it makes sense to reuse it for normal multiplies as well. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224847 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-26 09:10:14 +00:00
Michael Kuperstein	a098c770e1	[ValueTracking] Move GlobalAlias handling to be after the max depth check in computeKnownBits() GlobalAlias handling used to be after GlobalValue handling, which meant it was, in practice, dead code. r220165 moved GlobalAlias handling to be before GlobalValue handling, but also moved it to be before the max depth check, causing an assert due to a recursion depth limit violation. This moves GlobalAlias handling forward to where it's safe, and changes the GlobalValue handling to only look at GlobalObjects. Differential Revision: http://reviews.llvm.org/D6758 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224765 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-23 11:33:41 +00:00
Michael Liao	b9e302f3ca	[SimplifyCFG] Revise common code sinking - Fix the case where more than 1 common instructions derived from the same operand cannot be sunk. When a pair of value has more than 1 derived values in both branches, only 1 derived value could be sunk. - Replace BB1 -> (BB2, PN) map with joint value map, i.e. map of (BB1, BB2) -> PN, which is more accurate to track common ops. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224757 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-23 08:26:55 +00:00
Bruno Cardoso Lopes	a559a2317c	[LCSSA] Handle PHI insertion in disjoint loops Take two disjoint Loops L1 and L2. LoopSimplify fails to simplify some loops (e.g. when indirect branches are involved). In such situations, it can happen that an exit for L1 is the header of L2. Thus, when we create PHIs in one of such exits we are also inserting PHIs in L2 header. This could break LCSSA form for L2 because these inserted PHIs can also have uses in L2 exits, which are never handled in the current implementation. Provide a fix for this corner case and test that we don't assert/crash on that. Differential Revision: http://reviews.llvm.org/D6624 rdar://problem/19166231 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224740 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-22 22:35:46 +00:00
David Majnemer	6df827240e	This should have been part of r224676. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224677 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-20 04:48:34 +00:00
David Majnemer	854a37649a	InstCombine: Squash an icmp+select into bitwise arithmetic (X & INT_MIN) == 0 ? X ^ INT_MIN : X into X | INT_MIN (X & INT_MIN) != 0 ? X ^ INT_MIN : X into X & INT_MAX This fixes PR21993. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224676 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-20 04:45:35 +00:00
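Both select folds listed in the message above can be checked exhaustively on a small integer type. This is a sketch using 8-bit bit patterns, with `INT_MIN` and `INT_MAX` standing for the sign-bit mask and its complement at that width; the function names are hypothetical.

```python
INT_MIN = 0x80  # sign bit of an 8-bit value
INT_MAX = 0x7F  # all bits except the sign bit


def select_sign_clear(x):
    # (X & INT_MIN) == 0 ? X ^ INT_MIN : X
    return x ^ INT_MIN if (x & INT_MIN) == 0 else x


def select_sign_set(x):
    # (X & INT_MIN) != 0 ? X ^ INT_MIN : X
    return x ^ INT_MIN if (x & INT_MIN) != 0 else x


def folds_hold():
    # Compare each select against its bitwise replacement for all 8-bit X.
    return all(
        select_sign_clear(x) == (x | INT_MIN)
        and select_sign_set(x) == (x & INT_MAX)
        for x in range(256)
    )


if __name__ == "__main__":
    print(folds_hold())
```

The first select always ends with the sign bit set and the second always ends with it cleared, which is exactly what the OR and AND forms compute.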
David Majnemer	9cd99a0724	InstSimplify: Optimize away pointless comparisons (X & INT_MIN) ? X & INT_MAX : X into X & INT_MAX (X & INT_MIN) ? X : X & INT_MAX into X (X & INT_MIN) ? X | INT_MIN : X into X (X & INT_MIN) ? X : X | INT_MIN into X | INT_MIN git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224669 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-20 03:04:38 +00:00
Bruno Cardoso Lopes	06833ca7c1	Reapply: [InstCombine] Fix visitSwitchInst to use right operand types for sub cstexpr The visitSwitchInst generates SUB constant expressions to recompute the switch condition. When truncating the condition to a smaller type, SUB expressions should use the previous type (before trunc) for both operands. Also, fix code to also return the modified switch when only the truncation is performed. This fixes an assertion crash. Differential Revision: http://reviews.llvm.org/D6644 rdar://problem/19191835 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224588 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-19 17:12:35 +00:00
Sanjay Patel	7c5fa50875	use -0.0 when creating an fneg instruction Backends recognize (-0.0 - X) as the canonical form for fneg and produce better code. Eg, ppc64 with 0.0: lis r2, ha16(LCPI0_0) lfs f0, lo16(LCPI0_0)(r2) fsubs f1, f0, f1 blr vs. -0.0: fneg f1, f1 blr Differential Revision: http://reviews.llvm.org/D6723 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224583 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-19 16:44:08 +00:00
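The message above is about code generation, but the choice of -0.0 also matters for the sign of zero under IEEE-754 arithmetic. A minimal sketch with hypothetical helper names, assuming the default round-to-nearest mode that Python floats use:

```python
import math


def neg_via_positive_zero(x):
    # 0.0 - X: not a true negation when X is +0.0.
    return 0.0 - x


def neg_via_negative_zero(x):
    # -0.0 - X: matches negation for zeros and ordinary values.
    return -0.0 - x


def sign(x):
    # Sign of x, distinguishing +0.0 from -0.0.
    return math.copysign(1.0, x)


if __name__ == "__main__":
    for x in (0.0, -0.0, 1.5):
        print(x, sign(neg_via_positive_zero(x)), sign(neg_via_negative_zero(x)))
```

Negating +0.0 should give -0.0, which only the -0.0 form produces. This is why the `0 - X` form needs the `nsz` flag, seen in the related InstCombine commit, before it can be treated as a negation.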
Bruno Cardoso Lopes	01b07d541b	Revert "[InstCombine] Fix visitSwitchInst to use right operand types for sub cstexpr" Reverts commit r224574 to appease buildbots: The visitSwitchInst generates SUB constant expressions to recompute the switch condition. When truncating the condition to a smaller type, SUB expressions should use the previous type (before trunc) for both operands. This fixes an assertion crash. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224576 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-19 14:36:24 +00:00
Bruno Cardoso Lopes	cba407d019	[InstCombine] Fix visitSwitchInst to use right operand types for sub cstexpr The visitSwitchInst generates SUB constant expressions to recompute the switch condition. When truncating the condition to a smaller type, SUB expressions should use the previous type (before trunc) for both operands. This fixes an assertion crash. Differential Revision: http://reviews.llvm.org/D6644 rdar://problem/19191835 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224574 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-19 14:23:15 +00:00
David Majnemer	73059bd1f1	ConstantFold: Shifting undef by zero results in undef git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224553 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-18 23:54:43 +00:00
Suyog Sarda	4bfc4f2e8c	Revert 224119 "This patch recognizes (+ (+ v0, v1) (+ v2, v3)), reorders them for bundling into vector of loads, and vectorizes it." This was re-ordering floating point data types resulting in mismatch in output. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224424 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-17 10:34:27 +00:00
Elena Demikhovsky	982a8b3aeb	Added 5 more tests related to sink store revision 224247 - by Ella Bolshinsky http://reviews.llvm.org/D6420 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224418 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-17 08:12:59 +00:00
Erik Eckstein	96bd465d6c	Strength reduce intrinsics with overflow into regular arithmetic operations if possible. Some intrinsics, like s/uadd.with.overflow and umul.with.overflow, are already strength reduced. This change adds other arithmetic intrinsics: s/usub.with.overflow, smul.with.overflow. It completes the work on PR20194. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224417 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-17 07:29:19 +00:00
David Majnemer	891ec6d69f	InstSimplify: shl nsw/nuw undef, %V -> undef We can always choose a value for undef which might cause %V to shift out an important bit except for one case, when %V is zero. However, shl behaves like an identity function when the right hand side is zero. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224405 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-17 01:54:33 +00:00
Elena Demikhovsky	14fb445715	Masked Load and Store Intrinsics in loop vectorizer. The loop vectorizer optimizes loops containing conditional memory accesses by generating masked load and store intrinsics. This decision is target dependent. http://reviews.llvm.org/D6527 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224334 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-16 11:50:42 +00:00


6480 Commits