llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-12-15 20:29:48 +00:00

Author	SHA1	Message	Date
Arnold Schwaighofer	07a3c481c6	LoopVectorizer: Extend the induction variable to a larger type In some case the loop exit count computation can overflow. Extend the type to prevent most of those cases. The problem is loops like: int main () { int a = 1; char b = 0; lbl: a &= 4; b--; if (b) goto lbl; return a; } The backedge count is 255. The induction variable type is i8. If we add one to 255 to get the exit count we overflow to zero. To work around this issue we extend the type of the induction variable to i32 in the case of i8 and i16. PR17532 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195008 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-18 13:14:32 +00:00
NAKAMURA Takumi	80ccd9ea59	Utils/LoopUnroll.cpp: Tweak (StringRef)OldName to be valid until it is used, since r194601. eraseFromParent() invalidates OldName. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194970 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-17 18:05:34 +00:00
Hal Finkel	c8dc96be28	Add a loop rerolling flag to the PassManagerBuilder This adds a boolean member variable to the PassManagerBuilder to control loop rerolling (just like we have for unrolling and the various vectorization options). This is necessary for control by the frontend. Loop rerolling remains disabled by default at all optimization levels. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194966 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-17 16:02:50 +00:00
Hal Finkel	390564206f	Add the cold attribute to error-reporting call sites Generally speaking, control flow paths with error reporting calls are cold. So far, error reporting calls are calls to perror and calls to fprintf, fwrite, etc. with stderr as the stream. This can be extended in the future. The primary motivation is to improve block placement (the cold attribute affects the static branch prediction heuristics). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194943 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-17 02:06:35 +00:00
Hal Finkel	b7dabccbce	Fix ndebug-build unused variable in loop rerolling git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194941 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-17 01:21:54 +00:00
Hal Finkel	bebe48dbfe	Add a loop rerolling pass This adds a loop rerolling pass: the opposite of (partial) loop unrolling. The transformation aims to take loops like this: for (int i = 0; i < 3200; i += 5) { a[i] += alpha * b[i]; a[i + 1] += alpha * b[i + 1]; a[i + 2] += alpha * b[i + 2]; a[i + 3] += alpha * b[i + 3]; a[i + 4] += alpha * b[i + 4]; } and turn them into this: for (int i = 0; i < 3200; ++i) { a[i] += alpha * b[i]; } and loops like this: for (int i = 0; i < 500; ++i) { x[3i] = foo(0); x[3i+1] = foo(0); x[3*i+2] = foo(0); } and turn them into this: for (int i = 0; i < 1500; ++i) { x[i] = foo(0); } There are two motivations for this transformation: 1. Code-size reduction (especially relevant, obviously, when compiling for code size). 2. Providing greater choice to the loop vectorizer (and generic unroller) to choose the unrolling factor (and a better ability to vectorize). The loop vectorizer can take vector lengths and register pressure into account when choosing an unrolling factor, for example, and a pre-unrolled loop limits that choice. This is especially problematic if the manual unrolling was optimized for a machine different from the current target. The current implementation is limited to single basic-block loops only. The rerolling recognition should work regardless of how the loop iterations are intermixed within the loop body (subject to dependency and side-effect constraints), but the significant restriction is that the order of the instructions in each iteration must be identical. This seems sufficient to capture all current use cases. This pass is not currently enabled by default at any optimization level. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194939 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-16 23:59:05 +00:00
Hal Finkel	64fa501b10	Apply the InstCombine fptrunc sqrt optimization to llvm.sqrt InstCombine, in visitFPTrunc, applies the following optimization to sqrt calls: (fptrunc (sqrt (fpext x))) -> (sqrtf x) but does not apply the same optimization to llvm.sqrt. This is a problem because, to enable vectorization, Clang generates llvm.sqrt instead of sqrt in fast-math mode, and because this optimization is being applied to sqrt and not applied to llvm.sqrt, sometimes the fast-math code is slower. This change makes InstCombine apply this optimization to llvm.sqrt as well. This fixes the specific problem in PR17758, although the same underlying issue (optimizations applied to libcalls are not applied to intrinsics) exists for other optimizations in SimplifyLibCalls. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194935 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-16 21:29:08 +00:00
Benjamin Kramer	e9cdbf68e5	InstCombine: fold (A >> C) == (B >> C) --> (A^B) < (1 << C) for constant Cs. This is common in bitfield code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194925 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-16 16:00:48 +00:00
Arnold Schwaighofer	4634338655	LoopVectorizer: Use abi alignment for accesses with no alignment When we vectorize a scalar access with no alignment specified, we have to set the target's abi alignment of the scalar access on the vectorized access. Using the same alignment of zero would be wrong because most targets will have a bigger abi alignment for vector types. This probably fixes PR17878. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194876 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-15 23:09:33 +00:00
Manman Ren	c160efc28b	ArgumentPromotion: correctly transfer TBAA tags and alignments. We used to use std::map<IndicesVector, LoadInst> for OriginalLoads, and when we try to promote two arguments, they will both write to OriginalLoads causing created loads for the two arguments to have the same original load. And the same tbaa tag and alignment will be put to the created loads for the two arguments. The fix is to use std::map<std::pair<Argument, IndicesVector>, LoadInst*> for OriginalLoads, so each Argument will write to different parts of the map. PR17906 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194846 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-15 20:41:15 +00:00
Kostya Serebryany	8f15c68222	[asan] use GlobalValue::PrivateLinkage for coverage guard to save quite a bit of code size git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194800 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-15 09:52:05 +00:00
Bob Wilson	4b8991424a	Reapply "[asan] Poor man's coverage that works with ASan" I was able to successfully run a bootstrapped LTO build of clang with r194701, so this change does not seem to be the cause of our failing buildbots. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194789 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-15 07:16:09 +00:00
Matt Arsenault	6dd44d3b7f	Add instcombine visitor for addrspacecast git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194786 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-15 05:45:08 +00:00
Bob Wilson	2475da80ed	Revert "[asan] Poor man's coverage that works with ASan" This reverts commit 194701. Apple's bootstrapped LTO builds have been failing, and this change (along with compiler-rt 194702-194704) is the only thing on the blamelist. I will either reappy these changes or help debug the problem, depending on whether this fixes the buildbots. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194780 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-15 03:28:22 +00:00
Kostya Serebryany	8cc5f7cd59	[asan] Poor man's coverage that works with ASan git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194701 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-14 13:27:41 +00:00
Evgeniy Stepanov	34432aeb6d	[msan] Fast path optimization for wrap-indirect-calls feature of MemorySanitizer. Indirect call wrapping helps MSanDR (dynamic instrumentation companion tool for MSan) to catch all cases where execution leaves a compiler-instrumented module by allowing the tool to rewrite targets of indirect calls. This change is an optimization that skips wrapping for calls when target is inside the current module. This relies on the linker providing symbols at the begin and end of the module code (or code + data, does not really matter). Gold linker provides such symbols by default. GNU (BFD) linker needs a link flag: -Wl,--defsym=__executable_start=0. More info: https://code.google.com/p/memory-sanitizer/wiki/MSanDR#Native_exec git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194697 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-14 12:29:04 +00:00
Jakub Staszak	a305ffb65b	Use StringRef instead of std::string git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194601 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-13 20:09:11 +00:00
Alexey Samsonov	4223b96010	Fix -Wdelete-non-virtual-dtor warnings by making SampleProfile methods non-virtual git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194568 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-13 13:09:39 +00:00
Diego Novillo	563b29f8db	SampleProfileLoader pass. Initial setup. This adds a new scalar pass that reads a file with samples generated by 'perf' during runtime. The samples read from the profile are incorporated and emmited as IR metadata reflecting that profile. The profile file is assumed to have been generated by an external profile source. The profile information is converted into IR metadata, which is later used by the analysis routines to estimate block frequencies, edge weights and other related data. External profile information files have no fixed format, each profiler is free to define its own. This includes both the on-disk representation of the profile and the kind of profile information stored in the file. A common kind of profile is based on sampling (e.g., perf), which essentially counts how many times each line of the program has been executed during the run. The SampleProfileLoader pass is organized as a scalar transformation. On startup, it reads the file given in -sample-profile-file to determine what kind of profile it contains. This file is assumed to contain profile information for the whole application. The profile data in the file is read and incorporated into the internal state of the corresponding profiler. To facilitate testing, I've organized the profilers to support two file formats: text and native. The native format is whatever on-disk representation the profiler wants to support, I think this will mostly be bitcode files, but it could be anything the profiler wants to support. To do this, every profiler must implement the SampleProfile::loadNative() function. The text format is mostly meant for debugging. Records are separated by newlines, but each profiler is free to interpret records as it sees fit. Profilers must implement the SampleProfile::loadText() function. Finally, the pass will call SampleProfile::emitAnnotations() for each function in the current translation unit. This function needs to translate the loaded profile into IR metadata, which the analyzer will later be able to use. This patch implements the first steps towards the above design. I've implemented a sample-based flat profiler. The format of the profile is fairly simplistic. Each sampled function contains a list of relative line locations (from the start of the function) together with a count representing how many samples were collected at that line during execution. I generate this profile using perf and a separate converter tool. Currently, I have only implemented a text format for these profiles. I am interested in initial feedback to the whole approach before I send the other parts of the implementation for review. This patch implements: - The SampleProfileLoader pass. - The base ExternalProfile class with the core interface. - A SampleProfile sub-class using the above interface. The profiler generates branch weight metadata on every branch instructions that matches the profiles. - A text loader class to assist the implementation of SampleProfile::loadText(). - Basic unit tests for the pass. Additionally, the patch uses profile information to compute branch weights based on instruction samples. This patch converts instruction samples into branch weights. It does a fairly simplistic conversion: Given a multi-way branch instruction, it calculates the weight of each branch based on the maximum sample count gathered from each target basic block. Note that this assignment of branch weights is somewhat lossy and can be misleading. If a basic block has more than one incoming branch, all the incoming branches will get the same weight. In reality, it may be that only one of them is the most heavily taken branch. I will adjust this assignment in subsequent patches. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194566 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-13 12:22:21 +00:00
Nadav Rotem	0d833348c2	Update the docs to match the function name. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194537 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-13 01:12:01 +00:00
Nadav Rotem	6c84f7ad2d	Fold (iszero(A&K1) \| iszero(A&K2)) -> (A&(K1\|K2)) != (K1\|K2) if we know that K1 and K2 are 'one-hot' (only one bit is on). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194525 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-12 22:38:59 +00:00
Nadav Rotem	f3bd3ea3fe	FoldBranchToCommonDest merges branches into a single branch with or/and of the condition. It has a heuristics for estimating when some of the dependencies are processed by out-of-order processors. This patch adds another rule to the heuristics that says that if the "BonusInstruction" that we speculatively execute is used by the condition of the second branch then it is okay to hoist it. This change exposes more opportunities for other passes to transform the code. It does not matter that much that we if-convert the code because the selectiondag builder splits or/and branches into multiple branches when profitable. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194524 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-12 22:37:16 +00:00
Rafael Espindola	46456f6a2f	Corruptly merge constants with explicit and implicit alignments. Constant merge can merge a constant with implicit alignment with one that has explicit alignment. Before this change it was assuming that the explicit alignment was higher than the implicit one, causing the result to be under aligned in some cases. Fixes pr17815. Patch by Chris Smowton! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194506 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-12 20:21:43 +00:00
Benjamin Kramer	f681437cb0	SimplifyCFG: Use existing constant folding logic when forming switch tables. Both simpler and more powerful than the hand-rolled folding logic. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194475 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-12 12:24:36 +00:00
Shuxin Yang	e26299d76e	Correct a glitch in r194424 which may invalidate iterator. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194457 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-12 08:33:03 +00:00
Yuchen Wu	f42264e7e4	llvm-cov: Added call to update run/program counts. Also updated test files that were generated from this change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194453 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-12 04:59:08 +00:00
Shuxin Yang	6c7a7c6474	Fix PR17952. The symptom is that an assertion is triggered. The assertion was added by me to detect the situation when value is propagated from dead blocks. (We can certainly get rid of assertion; it is safe to do so, because propagating value from dead block to alive join node is certainly ok.) The root cause of this bug is : edge-splitting is conducted on the fly, the edge being split could be a dead edge, therefore the block that split the critial edge needs to be flagged "dead" as well. There are 3 ways to fix this bug: 1) Get rid of the assertion as I mentioned eariler 2) When an dead edge is split, flag the inserted block "dead". 3) proactively split the critical edges connecting dead and live blocks when new dead blocks are revealed. This fix go for 3) with additional 2 LOC. Testing case was added by Rafael the other day. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194424 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-11 22:00:23 +00:00
Renato Golin	4921d5b0a9	Move debug message in vectorizer No functional change, just better reporting. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194388 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-11 16:27:35 +00:00
Evgeniy Stepanov	4590b8c090	[msan] Propagate origin for insertvalue, extractvalue. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194374 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-11 13:37:10 +00:00
Bill Wendling	855c29d82c	Revert "Resurrect r191017 " GVN proceeds in the presence of dead code" plus a fix to PR17307 & 17308." This causes PR17852. This reverts commit `d93e8a06b2`. Conflicts: test/Transforms/GVN/cond_br2.ll git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194348 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-10 07:34:34 +00:00
Matt Arsenault	6d9e013447	Use type form of getIntPtrType. This should be inconsequential and is work towards removing the default address space arguments. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194347 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-10 04:46:57 +00:00
Nadav Rotem	30150a128c	SimplifyCFG has a heuristics for out-of-order processors that decides when it is worthwhile to merge branches. It tries to estimate if the operands of the instruction that we want to hoist are ready. This commit marks function arguments as 'ready' because they require no calculation. This boosts libquantum and a few other workloads from the testsuite. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194346 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-10 04:13:31 +00:00
Matt Arsenault	432bdf6571	Teach MergeFunctions about address spaces git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194342 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-10 01:44:37 +00:00
Hal Finkel	ab09d1e0ea	Remove dead code from LoopUnswitch LoopUnswitch's code simplification routine has logic to convert conditional branches into unconditional branches, after unswitching makes the condition constant, and then remove any blocks that renders dead. Unfortunately, this code is dead, currently broken, and furthermore, has never been alive (at least as far back at 2006). No functionality change intended. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194277 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-08 19:58:21 +00:00
Michael Gottesman	f23af8bfd8	[objc-arc] Convert the one directional retain/release relation assert to a conditional check + fail. Due to the previously added overflow checks, we can have a retain/release relation that is one directional. This occurs specifically when we run into an additive overflow causing us to drop state in only one direction. If that occurs, we should bail and not optimize that retain/release instead of asserting. Apologies for the size of the testcase. It is necessary to cause the additive cfg overflow to trigger. rdar://15377890 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194083 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-05 16:02:40 +00:00
Hal Finkel	c88eb08d02	Add a runtime unrolling parameter to the LoopUnroll pass constructor As with the other loop unrolling parameters (the unrolling threshold, partial unrolling, etc.) runtime unrolling can now also be controlled via the constructor. This will be necessary for moving non-trivial unrolling late in the pass manager (after loop vectorization). No functionality change intended. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194027 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-05 00:08:03 +00:00
Shuxin Yang	6f744ee498	Remove dead code git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194017 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-04 21:44:01 +00:00
Benjamin Kramer	63d8f88686	SLPVectorizer: Use properlyDominates to satisfy the irreflexivity of a strict weak ordering. STL debug mode checks this. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194015 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-04 21:34:55 +00:00
Matt Arsenault	eba6d38448	Scalarize select vector arguments when extracted. When the elements are extracted from a select on vectors or a vector select, do the select on the extracted scalars from the input if there is only one use. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@194013 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-04 20:36:06 +00:00
Benjamin Kramer	ec346c1314	SLPVectorizer: Add a missing pair of parens. No functionality change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193958 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-03 12:54:32 +00:00
Benjamin Kramer	0c7ba3cef2	SLPVectorizer: When CSEing generated gathers only scan blocks containing them. Instead of doing a RPO traversal of the whole function remember the blocks containing gathers (typically <= 2) and scan them in dominator-first order. The actual CSE is still quadratic, but I'm not confident that adding a scoped hash table here is worth it as we're only looking at the generated instructions and not arbitrary code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193956 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-03 12:27:52 +00:00
David Majnemer	16d1098718	Revert "Inliner: Handle readonly attribute per argument when adding memcpy" This reverts commit r193356, it caused PR17781. A reduced test case covering this regression has been added to the test suite. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193955 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-03 12:22:13 +00:00
David Majnemer	42864070b0	Spell "Actual" correctly git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193954 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-03 11:09:39 +00:00
Bob Wilson	208130f113	Convert calls to __sinpi and __cospi into __sincospi_stret This adds an SimplifyLibCalls case which converts the special __sinpi and __cospi (float & double variants) into a __sincospi_stret where appropriate to remove duplicated work. Patch by Tim Northover git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193943 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-03 06:48:38 +00:00
Benjamin Kramer	9bbc7b4e49	SLPVectorizer: Remove duplicated function. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193927 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-02 14:46:27 +00:00
Benjamin Kramer	ff566d8f44	LoopVectorize: Remove quadratic behavior the local CSE. Doing this with a hash map doesn't change behavior and avoids calling isIdenticalTo O(n^2) times. This should probably eventually move into a utility class shared with EarlyCSE and the limited CSE in the SLPVectorizer. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193926 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-02 13:39:00 +00:00
Arnold Schwaighofer	bc28e88a28	LoopVectorizer: Move cse code into its own function git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193895 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-01 23:28:54 +00:00
Arnold Schwaighofer	f4775827d0	LoopVectorizer: Perform redundancy elimination on induction variables When the loop vectorizer was part of the SCC inliner pass manager gvn would run after the loop vectorizer followed by instcombine. This way redundancy (multiple uses) were removed and instcombine could perform scalarization on the induction variables. Having moved the loop vectorizer to later we no longer run any form of redundancy elimination before we perform instcombine. This caused vectorized induction variables to survive that did not before. On a recent iMac this helps linpack back from 6000Mflops to 7000Mflops. This should also help lpbench and paq8p. I ran a Release (without Asserts) build over the test-suite and did not see any negative impact on compile time. radar://15339680 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193891 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-01 22:18:19 +00:00
Benjamin Kramer	7208b0763c	LoopVectorize: Look for consecutive acces in GEPs with trailing zero indices If we have a pointer to a single-element struct we can still build wide loads and stores to it (if there is no padding). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193860 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-01 14:09:50 +00:00
Arnold Schwaighofer	0097e15502	LoopVectorizer: If dependency checks fail try runtime checks When a dependence check fails we can still try to vectorize loops with runtime array bounds checks. This helps linpack to vectorize a loop in dgefa. And we are back to 2x of the scalar performance on a corei7-avx. radar://15339680 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193853 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-01 03:05:07 +00:00

1 2 3 4 5 ...

10915 Commits