llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2025-02-23 20:29:30 +00:00

Author	SHA1	Message	Date
Gerolf Hoflehner	9d4048578c	Revert commit r207302 since build failures have been reported. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207303 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-26 02:03:17 +00:00
Gerolf Hoflehner	4c9277bb9f	RecursivelyDeleteTriviallyDeadInstructions() could remove more than 1 instruction. The caller needs to be aware of this and adjust instruction iterators accordingly. rdar://16679376 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207302 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-26 01:19:16 +00:00
Andrea Di Biagio	96db9b8ed8	[InstCombine][X86] Teach how to fold calls to SSE2/AVX2 packed logical shift right intrinsics. A packed logical shift right with a shift count bigger than or equal to the element size always produces a zero vector. In all other cases, it can be safely replaced by a 'lshr' instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207299 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-26 01:03:22 +00:00
Adam Nemet	d761cc1dfa	[LoopStrengthReduce] Don't trim formula that uses a subset of required registers Consider this use from the new testcase: LSR Use: Kind=ICmpZero, Offsets={0}, widest fixup type: i32 reg({1000,+,-1}<nw><%for.body>) -3003 + reg({3,+,3}<nw><%for.body>) -1001 + reg({1,+,1}<nuw><nsw><%for.body>) -1000 + reg({0,+,1}<nw><%for.body>) -3000 + reg({0,+,3}<nuw><%for.body>) reg({-1000,+,1}<nw><%for.body>) reg({-3000,+,3}<nsw><%for.body>) This is the last use we consider for a solution in SolveRecurse, so CurRegs is a large set. (CurRegs is the set of registers that are needed by the previously visited uses in the in-progress solution.) ReqRegs is { {3,+,3}<nw><%for.body>, {1,+,1}<nuw><nsw><%for.body> } This is the intersection of the regs used by any of the formulas for the current use and CurRegs. Now, the code requires a formula to contain all these regs (the comment is simply wrong), otherwise the formula is immediately disqualified. Obviously, no formula for this use contains two regs so they will all get disqualified. The fix modifies the check to allow the formula in this case. The idea is that neither of these formulae is introducing any new registers which is the point of this early pruning as far as I understand. In terms of set arithmetic, we now allow formulas whose used regs are a subset of the required regs not just the other way around. There are few more loops in the test-suite that are now successfully LSRed. I have benchmarked those and found very minimal change. Fixes <rdar://problem/13965777> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207271 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-25 21:02:21 +00:00
Manman Ren	3bd471dee2	[inline cold threshold] Command line argument for inline threshold will override the default cold threshold. When we use command line argument to set the inline threshold, the default cold threshold will not be used. This is in line with how we use OptSizeThreshold. When we want a higher threshold for all functions, we do not have to set both inline threshold and cold threshold. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207245 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-25 17:34:55 +00:00
Karthik Bhat	ac16f0e024	Allow vectorization of bit intrinsics in BB Vectorizer. This patch adds support for vectorization of bit intrinsics such as bswap,ctpop,ctlz,cttz. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207174 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-25 03:33:48 +00:00
Zinovy Nis	25209ab486	[CLNUP] Test commit. Remove newline. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207089 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-24 08:42:58 +00:00
Karthik Bhat	0698b2b6cc	Allow vectorization of a few missed LLVM intrinsic calls in BBVectorizer by handling them in the isVectorizableIntrinsic function. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207085 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-24 07:29:55 +00:00
Michael J. Spencer	96363d5001	[InstCombine][x86] Constant fold psll intrinsics. This excludes avx512 as I don't have hardware to verify. It excludes _dq variants because they are represented in the IR as <{2,4} x i64> when it's actually a byte shift of the entire i{128,256}. This also excludes _dq_bs as they aren't at all supported by the backend. There are also no corresponding instructions in the ISA. I have no idea why they exist... git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207058 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-24 00:58:18 +00:00
Filipe Cabecinhas	cd9f6b870e	Optimize some special cases for SSE4a insertqi Summary: Since the upper 64 bits of the destination register are undefined when performing this operation, we can substitute it and let the optimizer figure out that only a copy is needed. Also added range merging, if an instruction copies a range that can be merged with a previous copied range. Added test cases for both optimizations. Reviewers: grosbach, nadav CC: llvm-commits Differential Revision: http://reviews.llvm.org/D3357 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207055 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-24 00:38:14 +00:00
Matt Arsenault	8bd9405026	Handle addrspacecast when looking at memcpys from globals git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207054 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-24 00:01:09 +00:00
Matt Arsenault	0e92fe9dce	Convert test to FileCheck git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207015 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-23 19:32:37 +00:00
Alexander Musman	bf255f5d5a	[LV] Statistics numbers for LoopVectorize introduced: a number of analyzed loops & a number of vectorized loops. Use -stats to see how many loops were analyzed for possible vectorization and how many of them were actually vectorized. Patch by Zinovy Nis Differential Revision: http://reviews.llvm.org/D3438 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206956 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-23 08:40:37 +00:00
Juergen Ributzka	b95412cc24	[Constant Hoisting] Materialize the constant before the cloned cast instruction. In the case where the constant comes from a cloned cast instruction, the materialization code has to go before the cloned cast instruction. This commit fixes the method that finds the materialization insertion point by making it aware of this case. This fixes <rdar://problem/15532441> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206913 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-22 18:06:58 +00:00
Rafael Espindola	db0a73f31b	Simplify a vpermil* with constant mask. With a constant mask a vpermil* is just a shufflevector. This patch implements that simplification. This allows us to produce denser code. It should also allow more folding down the line. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206801 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-21 22:06:04 +00:00
Reid Kleckner	0df9abbd63	Fix PR7272 in -tailcallelim instead of the inliner The -tailcallelim pass should be checking if byval or inalloca args can be captured before marking calls as tail calls. This was the real root cause of PR7272. With a better fix in place, revert the inliner change from r105255. The test case it introduced still passes and has been moved to test/Transforms/Inline/byval-tail-call.ll. Reviewers: chandlerc Differential Revision: http://reviews.llvm.org/D3403 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206789 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-21 20:48:47 +00:00
Jiangning Liu	eea662fead	Add missing config file for newly added test case introduced by r206563. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206567 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 09:05:50 +00:00
Jiangning Liu	a1da819896	This commit allows vectorized loops to be unrolled by a factor of 2 for AArch64. A new test case is also added for ARM64. Patched by Z.Zheng git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206563 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 07:57:54 +00:00
Diego Novillo	0a0d620db3	Fix bug 19437 - Only add discriminators for DWARF 4 and above. Summary: This prevents the discriminator generation pass from triggering if the DWARF version being used in the module is prior to 4. Reviewers: echristo, dblaikie CC: llvm-commits Differential Revision: http://reviews.llvm.org/D3413 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206507 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-17 22:33:50 +00:00
Gerolf Hoflehner	d5e9413512	Reverse 206485. After some discussions the preferred semantics of the always_inline attribute is inline always when the compiler can determine that it is safe to do so. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206487 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-17 19:14:06 +00:00
Tim Northover	09da6b5540	Atomics: promote ARM's IR-based atomics pass to CodeGen. Still only 32-bit ARM using it at this stage, but the promotion allows direct testing via opt and is a reasonably self-contained patch on the way to switching ARM64. At this point, other targets should be able to make use of it without too much difficulty if they want. (See ARM64 commit coming soon for an example). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206485 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-17 18:22:47 +00:00
Gerolf Hoflehner	d6312bbbbd	Inline a function when the always_inline attribute is set even when it contains an indirect branch. The attribute overrules correctness concerns like the escape of a local block address. This is for rdar://16501761 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206429 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-17 00:21:52 +00:00
Julien Lerouge	894b7f642c	Add lifetime markers for allocas created to hold byval arguments, make them appear in the InlineFunctionInfo. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206308 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-15 18:06:46 +00:00
NAKAMURA Takumi	bc4e3c024c	vect.omp.persistence.ll REQUIRES asserts due to -debug-only. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206271 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-15 10:12:47 +00:00
Alexey Bataev	15cbb64eb4	D3348 - [BUG] "Rotate Loop" pass kills "llvm.vectorizer.enable" metadata git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206266 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-15 09:37:30 +00:00
Matt Arsenault	448a1a0734	Revert "Revert r206045, "Fix shift by constants for vector."" Fix cases where the Value itself is used, and not the constant value. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206214 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-14 21:50:37 +00:00
NAKAMURA Takumi	1377449177	Whitespace. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206154 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-14 07:03:13 +00:00
NAKAMURA Takumi	9854380054	Revert r206045, "Fix shift by constants for vector." It broke some builders, at least, i686. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206153 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-14 07:02:57 +00:00
Hal Finkel	b9ed50cf17	[PowerPC] [Constant Hoisting] Enable constant hoisting on PPC Implements the various TTI functions to enable constant hoisting on PPC. The only significant test-suite change is this: MultiSource/Benchmarks/VersaBench/bmm/bmm - 20% speedup (which essentially reverses the slowdown from r206120). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206141 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-13 23:02:40 +00:00
Serge Pavlov	ea0ea63773	Recognize test for overflow in integer multiplication. If multiplication involves zero-extended arguments and the result is compared as in the patterns: %mul32 = trunc i64 %mul64 to i32 %zext = zext i32 %mul32 to i64 %overflow = icmp ne i64 %mul64, %zext or %overflow = icmp ugt i64 %mul64 , 0xffffffff then the multiplication may be replaced by call to umul.with.overflow. This change fixes PR4917 and PR4918. Differential Revision: http://llvm-reviews.chandlerc.com/D2814 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206137 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-13 18:23:41 +00:00
Juergen Ributzka	2aa7106dd6	[ARM64] Never hoist the shift value of a shift instruction. There is no need to check if we want to hoist the immediate value of a shift instruction. Simply return TCC_Free right away. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206101 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-12 02:53:51 +00:00
Juergen Ributzka	940b67465d	[ARM64] Fix the cost model for cheap large constants. Originally the cost model would give up for large constants and just return the maximum cost. This is not what we want for constant hoisting, because some of these constants are large in bitwidth, but are still cheap to materialize. This commit fixes the cost model to either return TCC_Free if the cost cannot be determined, or accurately calculate the cost even for large constants (bitwidth > 128). This fixes <rdar://problem/16591573>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206100 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-12 02:36:28 +00:00
Hal Finkel	24517d023f	Add the ability to use GEPs for address sinking in CGP The current memory-instruction optimization logic in CGP, which sinks parts of the address computation that can be absorbed by the addressing mode, does this by explicitly converting the relevant part of the address computation into IR-level integer operations (making use of ptrtoint and inttoptr). For most targets this is currently not a problem, but for targets wishing to make use of IR-level aliasing analysis during CodeGen, the use of ptrtoint/inttoptr is a problem for two reasons: 1. BasicAA becomes less powerful in the face of the ptrtoint/inttoptr 2. In cases where type-punning was used, and BasicAA was used to override TBAA, BasicAA may no longer do so. (this had forced us to disable all use of TBAA in CodeGen; something which we can now enable again) This (use of GEPs instead of ptrtoint/inttoptr) is not currently enabled by default (except for those targets that use AA during CodeGen), and so aside from some PowerPC subtargets and SystemZ, there should be no change in behavior. We may be able to switch completely away from the ptrtoint/inttoptr sinking on all targets, but further testing is required. I've doubled-up on a number of existing tests that are sensitive to the address sinking behavior (including some store-merging tests that are sensitive to the order of the resulting ADD operations at the SDAG level). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206092 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-12 00:59:48 +00:00
Matt Arsenault	fb33ce9956	Fix shift by constants for vector. ashr <N x iM>, <N x iM> M -> undef git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206045 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-11 17:57:53 +00:00
Arnold Schwaighofer	e2d124d396	Reapply "SLPVectorizer: Ignore users that are insertelements we can reschedule them" This commit reapplies 205018. After 205855 we should correctly vectorize intrinsics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205965 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-10 13:41:35 +00:00
Juergen Ributzka	d631cc4563	[ARM64] Fix immediate cost calculation for types larger than i64. The immediate cost calculation code was hitting an assertion in the included test case, because APInt was still internally 128-bits. Truncating it to 64-bits fixed the issue. Fixes <rdar://problem/16572521>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205947 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-10 01:36:59 +00:00
Arnold Schwaighofer	b0ee2374ce	SLPVectorizer: Only vectorize intrinsics whose operands are widened equally The vectorizer only knows how to vectorize intrinsics by widening all operands by the same factor. Patch by Tyler Nowicki! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205855 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-09 14:20:47 +00:00
Juergen Ributzka	c6a7502a80	[Constant Hoisting][ARM64] Enable constant hoisting for ARM64. This implements the target-hooks for ARM64 to enable constant hoisting. This fixes <rdar://problem/14774662> and <rdar://problem/16381500>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205791 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-08 20:39:59 +00:00
Eric Christopher	9a764dfa92	Handle vlas during inline cost computation if they'll be turned into a constant size alloca by inlining. Ran a run over the testsuite, no results out of the noise, fixes the testcase in the PR. PR19115. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205710 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-07 13:36:21 +00:00
Juergen Ributzka	7330dcb873	Update the test to use FileCheck. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205647 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-04 19:57:01 +00:00
Saleem Abdulrasool	2abadea537	ARM: yet another round of ARM test clean ups git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205586 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-03 23:47:24 +00:00
Eli Bendersky	1fcd561c73	Fix PR19270 - type mismatch caused by invalid optimization. Patch by Jingyue Wu. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205547 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-03 17:51:58 +00:00
Juergen Ributzka	a18ad697a9	Add test case for [Constant Hoisting] Erase dead cast instructions (r204538). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205484 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-02 23:06:22 +00:00
Adrian Prantl	e2ed9238bc	typo git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205473 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-02 22:17:30 +00:00
Juergen Ributzka	172e0ca8c5	Add comments and test case for [X86TTI] Make constant base pointers for GetElementPtr opaque (r204739). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205468 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-02 21:45:36 +00:00
Juergen Ributzka	5b10f87138	Add test case for [Stackmaps][X86TTI] Fix think-o in getIntImmCost calculation (r204738). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205464 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-02 21:15:36 +00:00
Tim Northover	24e78e0125	SLPVectorizer: compare entire intrinsic for SLP compatibility. Some Intrinsics are overloaded to the extent that return type equality (all that's been checked up to now) does not guarantee that the arguments are the same. In these cases SLP vectorizer should not recurse into the operands, which can be achieved by comparing them as "Function *" rather than simply the ID. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205424 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-02 14:39:02 +00:00
Hal Finkel	081e6fcd17	[LoopVectorizer] Count dependencies of consecutive pointers as uniforms For the purpose of calculating the cost of the loop at various vectorization factors, we need to count dependencies of consecutive pointers as uniforms (which means that the VF = 1 cost is used for all overall VF values). For example, the TSVC benchmark function s173 has: ... %3 = add nsw i64 %indvars.iv, 16000 %arrayidx8 = getelementptr inbounds %struct.GlobalData* @global_data, i64 0, i32 0, i64 %3 ... and we must realize that the add will be a scalar in order to correctly deduce it to be profitable to vectorize this on PowerPC with VSX enabled. In fact, all dependencies of a consecutive pointer must be a scalar (uniform), and so we simply need to add all consecutive pointers to the worklist that currently collects uniforms. Fixes PR19296. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205387 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-02 02:34:49 +00:00
Hal Finkel	e30aa957e3	Implement X86TTI::getUnrollingPreferences This provides an initial implementation of getUnrollingPreferences for x86. getUnrollingPreferences is used by the generic (concatenation) unroller, which is distinct from the unrolling done by the loop vectorizer. Many modern x86 cores have some kind of uop cache and loop-stream detector (LSD) used to efficiently dispatch small loops, and taking full advantage of this requires unrolling small loops (small here means 10s of uops). These caches also have limits on the number of taken branches in the loop, and so we also cap the loop unrolling factor based on the maximum "depth" of the loop. This is currently calculated with a partial DFS traversal (partial because it will stop early if the path length grows too much). This is still an approximation, and one that is both conservative (because it does not account for branches eliminated via block placement) and optimistic (because it is only recording the maximum depth over minimum paths). Nevertheless, because the loops that fit in these uop caches are so small, it is not clear how much the details matter. The original set of patches posted for review produced the following test-suite performance results (from the TSVC benchmark) at that time: ControlLoops-dbl - 13% speedup ControlLoops-flt - 15% speedup Reductions-dbl - 7.5% speedup git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205348 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-01 18:50:34 +00:00
Hal Finkel	6bbb01bbf8	Move partial/runtime unrolling late in the pipeline The generic (concatenation) loop unroller is currently placed early in the standard optimization pipeline. This is a good place to perform full unrolling, but not the right place to perform partial/runtime unrolling. However, most targets don't enable partial/runtime unrolling, so this never mattered. However, even some x86 cores benefit from partial/runtime unrolling of very small loops, and follow-up commits will enable this. First, we need to move partial/runtime unrolling late in the optimization pipeline (importantly, this is after SLP and loop vectorization, as vectorization can drastically change the size of a loop), while keeping the full unrolling where it is now. This change does just that. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205264 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-31 23:23:51 +00:00
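
Note on commit ea0ea63773 ("Recognize test for overflow in integer multiplication"): the message describes two IR patterns that test whether a 64-bit product of zero-extended 32-bit operands still fits in 32 bits, and says either may be replaced by a call to umul.with.overflow. The following is only a minimal sketch of that equivalence, modeled on plain Python integers rather than on LLVM itself. The function names and the MASK32 constant are hypothetical, chosen for illustration, and the reference model of the overflow flag is an assumption about the intrinsic's unsigned semantics, not code taken from the patch.

```python
MASK32 = 0xFFFFFFFF  # largest value representable as an unsigned i32 (illustrative name)


def overflow_trunc_zext(a: int, b: int) -> bool:
    """First pattern: trunc the i64 product to i32, zext it back, compare with icmp ne."""
    mul64 = (a & MASK32) * (b & MASK32)  # zero-extended operands; the product fits in 64 bits
    mul32 = mul64 & MASK32               # trunc i64 -> i32
    zext = mul32                         # zext i32 -> i64 leaves the value unchanged
    return mul64 != zext


def overflow_ugt(a: int, b: int) -> bool:
    """Second pattern: icmp ugt i64 %mul64, 0xffffffff."""
    mul64 = (a & MASK32) * (b & MASK32)
    return mul64 > MASK32


def umul_with_overflow_i32(a: int, b: int):
    """Assumed reference model: (wrapped 32-bit product, overflow flag)."""
    a &= MASK32
    b &= MASK32
    # Overflow is decided by division, without looking at the wide product.
    overflow = b != 0 and a > MASK32 // b
    return (a * b) & MASK32, overflow
```

For any pair of 32-bit inputs, both pattern functions should agree with the overflow flag of the reference model, which is what makes the rewrite described in the commit message safe in this simplified setting.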


5561 Commits