llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-12-15 04:30:12 +00:00

Author	SHA1	Message	Date
Jiangning Liu	eea662fead	Add missing config file for newly added test case introduced by r206563. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206567 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 09:05:50 +00:00
Jiangning Liu	a1da819896	This commit allows vectorized loops to be unrolled by a factor of 2 for AArch64. A new test case is also added for ARM64. Patched by Z.Zheng git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206563 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 07:57:54 +00:00
NAKAMURA Takumi	bc4e3c024c	vect.omp.persistence.ll REQUIRES asserts due to -debug-only. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206271 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-15 10:12:47 +00:00
Alexey Bataev	15cbb64eb4	D3348 - [BUG] "Rotate Loop" pass kills "llvm.vectorizer.enable" metadata git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206266 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-15 09:37:30 +00:00
Hal Finkel	081e6fcd17	[LoopVectorizer] Count dependencies of consecutive pointers as uniforms For the purpose of calculating the cost of the loop at various vectorization factors, we need to count dependencies of consecutive pointers as uniforms (which means that the VF = 1 cost is used for all overall VF values). For example, the TSVC benchmark function s173 has: ... %3 = add nsw i64 %indvars.iv, 16000 %arrayidx8 = getelementptr inbounds %struct.GlobalData* @global_data, i64 0, i32 0, i64 %3 ... and we must realize that the add will be a scalar in order to correctly deduce it to be profitable to vectorize this on PowerPC with VSX enabled. In fact, all dependencies of a consecutive pointer must be a scalar (uniform), and so we simply need to add all consecutive pointers to the worklist that currently detects collects uniforms. Fixes PR19296. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205387 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-02 02:34:49 +00:00
Hal Finkel	e30aa957e3	Implement X86TTI::getUnrollingPreferences This provides an initial implementation of getUnrollingPreferences for x86. getUnrollingPreferences is used by the generic (concatenation) unroller, which is distinct from the unrolling done by the loop vectorizer. Many modern x86 cores have some kind of uop cache and loop-stream detector (LSD) used to efficiently dispatch small loops, and taking full advantage of this requires unrolling small loops (small here means 10s of uops). These caches also have limits on the number of taken branches in the loop, and so we also cap the loop unrolling factor based on the maximum "depth" of the loop. This is currently calculated with a partial DFS traversal (partial because it will stop early if the path length grows too much). This is still an approximation, and one that is both conservative (because it does not account for branches eliminated via block placement) and optimistic (because it is only recording the maximum depth over minimum paths). Nevertheless, because the loops that fit in these uop caches are so small, it is not clear how much the details matter. The original set of patches posted for review produced the following test-suite performance results (from the TSVC benchmark) at that time: ControlLoops-dbl - 13% speedup ControlLoops-flt - 15% speedup Reductions-dbl - 7.5% speedup git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205348 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-01 18:50:34 +00:00
Hal Finkel	6bbb01bbf8	Move partial/runtime unrolling late in the pipeline The generic (concatenation) loop unroller is currently placed early in the standard optimization pipeline. This is a good place to perform full unrolling, but not the right place to perform partial/runtime unrolling. However, most targets don't enable partial/runtime unrolling, so this never mattered. However, even some x86 cores benefit from partial/runtime unrolling of very small loops, and follow-up commits will enable this. First, we need to move partial/runtime unrolling late in the optimization pipeline (importantly, this is after SLP and loop vectorization, as vectorization can drastically change the size of a loop), while keeping the full unrolling where it is now. This change does just that. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205264 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-31 23:23:51 +00:00
Adam Nemet	4ffbb65494	[X86] Adjust cost of FP_TO_UINT v4f64->v4i32 as well Pretty obvious follow-on to r205159 to also handle conversion from double besides float. Fixes <rdar://problem/16373208> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205253 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-31 21:54:48 +00:00
Adam Nemet	cb1800772a	[X86] Adjust cost of FP_TO_UINT v8f32->v8i32 There is no direct AVX instruction to convert to unsigned. I have some ideas how we may be able to do this with three vector instructions but the current backend just bails on this to get it scalarized. See the comment why we need to adjust the cost returned by BasicTTI. The test is a bit roundabout (and checks assembly rather than bit code) because I'd like it to work even if at some point we could vectorize this conversion. Fixes <rdar://problem/16371920> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205159 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-30 18:07:13 +00:00
Tim Northover	7b837d8c75	ARM64: initial backend import This adds a second implementation of the AArch64 architecture to LLVM, accessible in parallel via the "arm64" triple. The plan over the coming weeks & months is to merge the two into a single backend, during which time thorough code review should naturally occur. Everything will be easier with the target in-tree though, hence this commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205090 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-29 10:18:08 +00:00
Quentin Colombet	566abecc9f	[X86][Vectorizer Cost Model] Correct vectorization cost model for v2i64->v2f64 and v4i64->v4f64. The new costs match what we did for SSE2 and reflect the reality of our codegen. <rdar://problem/16381225> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204884 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-27 00:52:16 +00:00
Jim Grosbach	e6cee9f723	add 'requires asserts' to test that needs it git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204882 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-27 00:20:42 +00:00
Jim Grosbach	f20d9ee6a6	X86: Correct vectorization cost model for v8f32->v8i8. Fix the cost model to reflect the reality of our codegen. rdar://16370633 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204880 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-27 00:04:11 +00:00
Arnold Schwaighofer	9d84b4d70c	LoopVectorizer: Preserve fast-math flags Fixes PR19045. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203008 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-05 21:10:47 +00:00
Arnold Schwaighofer	846acbeef1	LoopVectorizer: Keep track of conditional store basic blocks Before conditional store vectorization/unrolling we had only one vectorized/unrolled basic block. After adding support for conditional store vectorization this will not only be one block but multiple basic blocks. The last block would have the back-edge. I updated the code to use a vector of basic blocks instead of a single basic block and fixed the users to use the last entry in this vector. But, I forgot to add the basic blocks to this vector! Fixes PR18724. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@201028 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-08 20:41:13 +00:00
Arnold Schwaighofer	a16c1b55e2	LoopVectorizer: Enable unrolling of conditional stores and the load/store unrolling heuristic per default Benchmarking on x86_64 (thanks Chandler!) and ARM has shown those options speed up some benchmarks while not causing any interesting regressions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200621 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-02 03:12:34 +00:00
Arnold Schwaighofer	991dd3bb92	ARMTTI: We don't have 16 allocatable scalar registers This caused an regression on libquantum after enabling the new loop vectorizer unroll heuristics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200616 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-01 18:00:25 +00:00
Chandler Carruth	93228f6199	[vectorizer] Tweak the way we do small loop runtime unrolling in the loop vectorizer to not do so when runtime pointer checks are needed and share code with the new (not yet enabled) load/store saturation runtime unrolling. Also ensure that we only consider the runtime checks when the loop hasn't already been vectorized. If it has, the runtime check cost has already been paid. I've fleshed out a test case to cover the scalar unrolling as well as the vector unrolling and comment clearly why we are or aren't following the pattern. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200530 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-31 10:51:08 +00:00
Arnold Schwaighofer	fb70e11bbc	LoopVectorizer: Add a test case for unrolling of small loops that need a runtime check. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200408 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-29 18:55:44 +00:00
Chandler Carruth	05d43d8b6f	[vectorizer] Completely disable the block frequency guidance of the loop vectorizer, placing it behind an off-by-default flag. It turns out that block frequency isn't what we want at all, here or elsewhere. This has been I think a nagging feeling for several of us working with it, but Arnold has given some really nice simple examples where the results are so comprehensively wrong that they aren't useful. I'm planning to email the dev list with a summary of why its not really useful and a couple of ideas about how to better structure these types of heuristics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200294 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-28 09:10:41 +00:00
Arnold Schwaighofer	a47aa4b4ef	LoopVectorize: Support conditional stores by scalarizing The vectorizer takes a loop like this and widens all instructions except for the store. The stores are scalarized/unrolled and hidden behind an "if" block. for (i = 0; i < 128; ++i) { if (a[i] < 10) a[i] += val; } for (i = 0; i < 128; i+=2) { v = a[i:i+1]; v0 = (extract v, 0) + 10; v1 = (extract v, 1) + 10; if (v0 < 10) a[i] = v0; if (v1 < 10) a[i] = v1; } The vectorizer relies on subsequent optimizations to sink instructions into the conditional block where they are anticipated. The flag "vectorize-num-stores-pred" controls whether and how many stores to handle this way. Vectorization of conditional stores is disabled per default for now. This patch also adds a change to the heuristic when the flag "enable-loadstore-runtime-unroll" is enabled (off by default). It unrolls small loops until load/store ports are saturated. This heuristic uses TTI's getMaxUnrollFactor as a measure for load/store ports. I also added a second flag -enable-cond-stores-vec. It will enable vectorization of conditional stores. But there is no cost model for vectorization of conditional stores in place yet so this will not do good at the moment. rdar://15892953 Results for x86-64 -O3 -mavx +/- -mllvm -enable-loadstore-runtime-unroll -vectorize-num-stores-pred=1 (before the BFI change): Performance Regressions: Benchmarks/Ptrdist/yacr2/yacr2 7.35% (maze3() is identical but 10% slower) Applications/siod/siod 2.18% Performance improvements: mesa -4.42% libquantum -4.15% With a patch that slightly changes the register heuristics (by subtracting the induction variable on both sides of the register pressure equation, as the induction variable is probably not really unrolled): Performance Regressions: Benchmarks/Ptrdist/yacr2/yacr2 7.73% Applications/siod/siod 1.97% Performance Improvements: libquantum -13.05% (we now also unroll quantum_toffoli) mesa -4.27% git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200270 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-28 01:01:53 +00:00
Chandler Carruth	5f61e70eac	[vectorize] Initial version of respecting PGO in the vectorizer: treat cold loops as-if they were being optimized for size. Nothing fancy here. Simply test case included. The nice thing is that we can now incrementally build on top of this to drive other heuristics. All of the infrastructure work is done to get the profile information into this layer. The remaining work necessary to make this a fully general purpose loop unroller for very hot loops is to make it a fully general purpose loop unroller. Things I know of but am not going to have time to benchmark and fix in the immediate future: 1) Don't disable the entire pass when the target is lacking vector registers. This really doesn't make any sense any more. 2) Teach the unroller at least and the vectorizer potentially to handle non-if-converted loops. This is trivial for the unroller but hard for the vectorizer. 3) Compute the relative hotness of the loop and thread that down to the various places that make cost tradeoffs (very likely only the unroller makes sense here, and then only when dealing with loops that are small enough for unrolling to not completely blow out the LSD). I'm still dubious how useful hotness information will be. So far, my experiments show that if we can get the correct logic for determining when unrolling actually helps performance, the code size impact is completely unimportant and we can unroll in all cases. But at least we'll no longer burn code size on cold code. One somewhat unrelated idea that I've had forever but not had time to implement: mark all functions which are only reachable via the global constructors rigging in the module as optsize. This would also decrease the impact of any more aggressive heuristics here on code size. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200219 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-27 13:11:50 +00:00
Chandler Carruth	1c4746ed70	[vectorizer] Add an override for the target instruction cost and use it to stabilize a test that really is trying to test generic behavior and not a specific target's behavior. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200215 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-27 11:41:50 +00:00
Chandler Carruth	424b2b0093	[vectorizer] Teach the loop vectorizer's unroller to only unroll by powers of two. This is essentially always the correct thing given the impact on alignment, scaling factors that can be used in addressing modes, etc. Also, fix the management of the unroll vs. small loop cost to more accurately model things with this world. Enhance a test case to actually exercise more of the unroll machinery if using synthetic constants rather than a specific target model. Before this change, with the added flags this test will unroll 3 times instead of either 2 or 4 (the two sensible answers). While I don't expect this to make a huge difference, if there are lots of loops sitting right on the edge of hitting the 'small unroll' factor, they might change behavior. However, I've benchmarked moving the small loop cost up and down in many various ways and by a huge factor (2x) without seeing more than 0.2% code size growth. Small adjustments such as the series that led up here have led to about 1% improvement on some benchmarks, but it is very close to the noise floor so I mostly checked that nothing regressed. Let me know if you see bad behavior on other targets but I don't expect this to be a sufficiently dramatic change to trigger anything. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200213 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-27 11:12:24 +00:00
Alp Toker	ae43cab6ba	Fix known typos Sweep the codebase for common typos. Includes some changes to visible function names that were misspelt. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@200018 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-24 17:20:08 +00:00
Benjamin Kramer	0487faa97b	InstCombine: Teach most integer add/sub/mul/div combines how to deal with vectors. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199602 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-19 15:24:22 +00:00
Arnold Schwaighofer	2becaaf3a1	LoopVectorizer: A reduction that has multiple uses of the reduction value is not a reduction. Really. Under certain circumstances (the use list of an instruction has to be set up right - hence the extra pass in the test case) we would not recognize when a value in a potential reduction cycle was used multiple times by the reduction cycle. Fixes PR18526. radar://15851149 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199570 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-19 03:18:31 +00:00
Arnold Schwaighofer	e96fec2e43	LoopVectorize: Only strip casts from integer types when replacing symbolic strides Fixes PR18480. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199291 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-15 03:35:46 +00:00
Benjamin Kramer	ccdb9c9483	Fix broken CHECK lines. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@199016 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-11 21:06:00 +00:00
Arnold Schwaighofer	ee3f7de62e	LoopVectorizer: Handle strided memory accesses by versioning for (i = 0; i < N; ++i) A[i * Stride1] += B[i * Stride2]; We take loops like this and check that the symbolic strides 'Strided1/2' are one and drop to the scalar loop if they are not. This is currently disabled by default and hidden behind the flag 'enable-mem-access-versioning'. radar://13075509 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198950 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-10 18:20:32 +00:00
Arnold Schwaighofer	83196a9fcb	LoopVectorizer: Don't if-convert constant expressions that can trap A phi node operand or an instruction operand could be a constant expression that can trap (division). Check that we don't vectorize such cases. PR16729 radar://15653590 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@197449 91177308-0d34-0410-b5e6-96231b3b80d8	2013-12-17 01:11:01 +00:00
Renato Golin	1de8133402	force vector width via cpu on vectorizer metadata enable git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196669 91177308-0d34-0410-b5e6-96231b3b80d8	2013-12-07 21:46:08 +00:00
Renato Golin	3a6ea481a1	Move test to X86 dir Test is platform independent, but I don't want to force vector-width, or that could spoil the pragma test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196539 91177308-0d34-0410-b5e6-96231b3b80d8	2013-12-05 21:45:39 +00:00
Renato Golin	07d9471bc5	Add #pragma vectorize enable/disable to LLVM The intended behaviour is to force vectorization on the presence of the flag (either turn on or off), and to continue the behaviour as expected in its absence. Tests were added to make sure the all cases are covered in opt. No tests were added in other tools with the assumption that they should use the PassManagerBuilder in the same way. This patch also removes the outdated -late-vectorize flag, which was on by default and not helping much. The pragma metadata is being attached to the same place as other loop metadata, but nothing forbids one from attaching it to a function (to enable #pragma optimize) or basic blocks (to hint the basic-block vectorizers), etc. The logic should be the same all around. Patches to Clang to produce the metadata will be produced after the initial implementation is agreed upon and committed. Patches to other vectorizers (such as SLP and BB) will be added once we're happy with the pass manager changes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196537 91177308-0d34-0410-b5e6-96231b3b80d8	2013-12-05 21:20:02 +00:00
Alp Toker	087ab613f4	Correct word hyphenations This patch tries to avoid unrelated changes other than fixing a few hyphen-related ambiguities and contractions in nearby lines. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196471 91177308-0d34-0410-b5e6-96231b3b80d8	2013-12-05 05:44:44 +00:00
Arnold Schwaighofer	df9c9da884	opt: Mirror vectorization presets of clang clang enables vectorization at optimization levels > 1 and size level < 2. opt should behave similarily. Loop vectorization and SLP vectorization can be disabled with the flags -disable-(loop/slp)-vectorization. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196294 91177308-0d34-0410-b5e6-96231b3b80d8	2013-12-03 16:33:06 +00:00
Arnold Schwaighofer	b40f14eb89	LoopVectorizer: Truncate i64 trip counts of i32 phis if necessary In signed arithmetic we could end up with an i64 trip count for an i32 phi. Because it is signed arithmetic we know that this is only defined if the i32 does not wrap. It is therefore safe to truncate the i64 trip count to a i32 value. Fixes PR18049. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195787 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-26 22:11:23 +00:00
Manman Ren	bec50063a5	Debug Info: update testing cases to specify the debug info version number. We are going to drop debug info without a version number or with a different version number, to make sure we don't crash when we see bitcode files with different debug info metadata format. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195504 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-22 21:49:45 +00:00
Arnold Schwaighofer	4bc2e3a32d	SLPVectorizer: Fix stale for Value pointer array We are slicing an array of Value pointers and process those slices in a loop. The problem is that we might invalidate a later slice by vectorizing a former slice. Use a WeakVH to track the pointer. If the pointer is deleted or RAUW'ed we can tell. The test case will only fail when running with libgmalloc. radar://15498655 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195162 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-19 22:20:20 +00:00
Arnold Schwaighofer	07a3c481c6	LoopVectorizer: Extend the induction variable to a larger type In some case the loop exit count computation can overflow. Extend the type to prevent most of those cases. The problem is loops like: int main () { int a = 1; char b = 0; lbl: a &= 4; b--; if (b) goto lbl; return a; } The backedge count is 255. The induction variable type is i8. If we add one to 255 to get the exit count we overflow to zero. To work around this issue we extend the type of the induction variable to i32 in the case of i8 and i16. PR17532 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195008 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-18 13:14:32 +00:00
Arnold Schwaighofer	4634338655	LoopVectorizer: Use abi alignment for accesses with no alignment When we vectorize a scalar access with no alignment specified, we have to set the target's abi alignment of the scalar access on the vectorized access. Using the same alignment of zero would be wrong because most targets will have a bigger abi alignment for vector types. This probably fixes PR17878. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194876 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-15 23:09:33 +00:00
Matt Arsenault	eba6d38448	Scalarize select vector arguments when extracted. When the elements are extracted from a select on vectors or a vector select, do the select on the extracted scalars from the input if there is only one use. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194013 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-04 20:36:06 +00:00
Arnold Schwaighofer	f4775827d0	LoopVectorizer: Perform redundancy elimination on induction variables When the loop vectorizer was part of the SCC inliner pass manager gvn would run after the loop vectorizer followed by instcombine. This way redundancy (multiple uses) were removed and instcombine could perform scalarization on the induction variables. Having moved the loop vectorizer to later we no longer run any form of redundancy elimination before we perform instcombine. This caused vectorized induction variables to survive that did not before. On a recent iMac this helps linpack back from 6000Mflops to 7000Mflops. This should also help lpbench and paq8p. I ran a Release (without Asserts) build over the test-suite and did not see any negative impact on compile time. radar://15339680 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193891 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-01 22:18:19 +00:00
Benjamin Kramer	7208b0763c	LoopVectorize: Look for consecutive acces in GEPs with trailing zero indices If we have a pointer to a single-element struct we can still build wide loads and stores to it (if there is no padding). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193860 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-01 14:09:50 +00:00
Arnold Schwaighofer	0097e15502	LoopVectorizer: If dependency checks fail try runtime checks When a dependence check fails we can still try to vectorize loops with runtime array bounds checks. This helps linpack to vectorize a loop in dgefa. And we are back to 2x of the scalar performance on a corei7-avx. radar://15339680 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193853 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-01 03:05:07 +00:00
Arnold Schwaighofer	c04d241d13	ARM cost model: Unaligned vectorized double stores are expensive Updated a test case that assumed that <2 x double> would vectorize to use <4 x float>. radar://15338229 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193574 91177308-0d34-0410-b5e6-96231b3b80d8	2013-10-29 01:33:57 +00:00
Hal Finkel	006183a936	LoopVectorizer: Don't attempt to vectorize extractelement instructions The loop vectorizer does not currently understand how to vectorize extractelement instructions. The existing check, which excluded all vector-valued instructions, did not catch extractelement instructions because it checked only the return value. As a result, vectorization would proceed, producing illegal instructions like this: %58 = extractelement <2 x i32> %15, i32 0 %59 = extractelement i32 %58, i32 0 where the second extractelement is illegal because its first operand is not a vector. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193434 91177308-0d34-0410-b5e6-96231b3b80d8	2013-10-25 20:40:15 +00:00
Renato Golin	e662fb6083	I had to move and remove git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193355 91177308-0d34-0410-b5e6-96231b3b80d8	2013-10-24 16:31:43 +00:00
Renato Golin	93fd763184	Fix broken builds by moving test to x86 dir git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193351 91177308-0d34-0410-b5e6-96231b3b80d8	2013-10-24 15:11:03 +00:00
Renato Golin	d6aa89eca5	Mark vector loops as already vectorized Make sure we mark all loops (scalar and vector) when vectorizing, so that we don't try to vectorize them anymore. Also, set unroll to 1, since this is what we check for on early exit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193349 91177308-0d34-0410-b5e6-96231b3b80d8	2013-10-24 14:50:51 +00:00

1 2 3 4 5

222 Commits