llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2025-01-17 06:33:21 +00:00

Author	SHA1	Message	Date
Juergen Ributzka	d3cf783ed1	[Constant Hoisting] Make the constant materialization cost operand dependent Extend the target hook to take also the operand index into account when calculating the cost of the constant materialization. Related to <rdar://problem/16381500> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204435 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-21 06:04:45 +00:00
Juergen Ributzka	8ebaf63750	[Constant Hoisting] Change the algorithm to only track constants for instructions. Originally the algorithm would search for expensive constants and track their users, which could be instructions and constant expressions. This change only tracks the constants for instructions, but constant expressions are indirectly covered too. If an operand is an constant expression, then we look through the expression to find anny expensive constants. The algorithm keep now track of the instruction and the operand index where the constant is used. This allows more precise hoisting of constant materialization code for PHI instructions, because we only hoist to the basic block of the incoming operand. Before we had to find the idom of all PHI operands and hoist the materialization code there. This also makes updating of instructions easier. Before we had to keep track of the original constant, find it in the instructions, and then replace it. Now we can just simply update the operand. Related to <rdar://problem/16381500> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204433 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-21 06:04:36 +00:00
Juergen Ributzka	ee3242ed0b	Revert "[Constant Hoisting] Extend coverage of the constant hoisting pass." I will break this up into smaller pieces for review and recommit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204393 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-20 20:17:13 +00:00
Juergen Ributzka	228c72a841	[Constant Hoisting] Extend coverage of the constant hoisting pass. This commit extends the coverage of the constant hoisting pass, adds additonal debug output and updates the function names according to the style guide. Related to <rdar://problem/16381500> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204389 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-20 19:55:52 +00:00
Mark Seaborn	9bb9615e1f	Remove LowerInvoke's obsolete "-enable-correct-eh-support" option This option caused LowerInvoke to generate code using SJLJ-based exception handling, but there is no code left that interprets the jmp_buf stack that the resulting code maintained (llvm.sjljeh.jblist). This option has been obsolete for a while, and replaced by SjLjEHPrepare. This leaves the default behaviour of LowerInvoke, which is to convert invokes to calls. Differential Revision: http://llvm-reviews.chandlerc.com/D3136 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204388 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-20 19:54:47 +00:00
Mark Seaborn	25efa8a8d2	Add a test for LowerInvoke that doesn't use "-enable-correct-eh-support" None of the existing tests for LowerInvoke check LowerInvoke's output, and all but one use "-enable-correct-eh-support", which is obsolete, so those tests will be removed when that option is removed. To make sure LowerInvoke will still have test coverage, this adds a test for its default mode which converts invokes to calls. Differential Revision: http://llvm-reviews.chandlerc.com/D3124 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204344 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-20 14:12:47 +00:00
Duncan P. N. Exon Smith	9ea770ddbb	Fix use_iterator crash in ObjCArc from r203364 The use_iterator redesign in r203364 introduced an increment past the end of a range in -objc-arc-contract. Added an explicit check for the end of the range. <rdar://problem/16333235> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204195 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-18 22:32:43 +00:00
Diego Novillo	c605296342	Tolerate unmangled names in sample profiles. Summary: The compiler does not always generate linkage names. If a function has been inlined and its body elided, its linkage name may not be generated. When the binary executes, the profiler will use its unmangled name when attributing samples. This results in unmangled names in the input profile. We are currently failing hard when this happens. However, in this case all that happens is that we fail to attribute samples to the inlined function. While this means fewer optimization opportunities, it should not cause a compilation failure. This patch accepts all valid function names, regardless of whether they were mangled or not. Reviewers: chandlerc CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D3087 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204142 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-18 12:03:12 +00:00
Dan Gohman	43873d4f73	Use range metadata instead of introducing selects. When GlobalOpt has determined that a GlobalVariable only ever has two values, it would convert the GlobalVariable to a boolean, and introduce SelectInsts at every load, to choose between the two possible values. These SelectInsts introduce overhead and other unpleasantness. This patch makes GlobalOpt just add range metadata to loads from such GlobalVariables instead. This enables the same main optimization (as seen in test/Transforms/GlobalOpt/integer-bool.ll), without introducing selects. The main downside is that it doesn't get the memory savings of shrinking such GlobalVariables, but this is expected to be negligible. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204076 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-17 19:57:04 +00:00
NAKAMURA Takumi	8e85e50170	llvm/test/Transforms/SampleProfile/syntax.ll: Suppress checking the message catalog in ENOENT. It is locale-dependent on Windows. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203997 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-15 02:32:21 +00:00
Diego Novillo	3f7c2f31ff	Use DiagnosticInfo facility. Summary: The sample profiler pass emits several error messages. Instead of just aborting the compiler with report_fatal_error, we can emit better messages using DiagnosticInfo. This adds a new sub-class of DiagnosticInfo to handle the sample profiler. Reviewers: chandlerc, qcolombet CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D3086 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203976 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-14 21:58:59 +00:00
Rafael Espindola	1f21e0dd0d	Remove the linker_private and linker_private_weak linkages. These linkages were introduced some time ago, but it was never very clear what exactly their semantics were or what they should be used for. Some investigation found these uses: * utf-16 strings in clang. * non-unnamed_addr strings produced by the sanitizers. It turns out they were just working around a more fundamental problem. For some sections a MachO linker needs a symbol in order to split the section into atoms, and llvm had no idea that was the case. I fixed that in r201700 and it is now safe to use the private linkage. When the object ends up in a section that requires symbols, llvm will use a 'l' prefix instead of a 'L' prefix and things just work. With that, these linkages were already dead, but there was a potential future user in the objc metadata information. I am still looking at CGObjcMac.cpp, but at this point I am convinced that linker_private and linker_private_weak are not what they need. The objc uses are currently split in * Regular symbols (no '\01' prefix). LLVM already directly provides whatever semantics they need. * Uses of a private name (start with "\01L" or "\01l") and private linkage. We can drop the "\01L" and "\01l" prefixes as soon as llvm agrees with clang on L being ok or not for a given section. I have two patches in code review for this. * Uses of private name and weak linkage. The last case is the one that one could think would fit one of these linkages. That is not the case. The semantics are * the linker will merge these symbol by name. * the linker will hide them in the final DSO. Given that the merging is done by name, any of the private (or internal) linkages would be a bad match. They allow llvm to rename the symbols, and that is really not what we want. From the llvm point of view, these objects should really be (linkonce\|weak)(_odr)?. For now, just keeping the "\01l" prefix is probably the best for these symbols. If we one day want to have a more direct support in llvm, IMHO what we should add is not a linkage, it is just a hidden_symbol attribute. It would be applicable to multiple linkages. For example, on weak it would produce the current behavior we have for objc metadata. On internal, it would be equivalent to private (and we should then remove private). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203866 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-13 23:18:37 +00:00
Owen Anderson	d3fc1be4f6	Fix a bug in InstCombine where we would incorrectly attempt to construct a bitcast between pointers of two different address spaces if they happened to have the same pointer size. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203862 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-13 22:51:43 +00:00
Manuel Jacob	f8909fa140	CodeGenPrep: sink extends of illegal types into use block. Summary: This helps the instruction selector to lower an i64 * i64 -> i128 multiplication into a single instruction on targets which support it. This is an update of D2973 which was reverted because of a bug reported as PR19084. Reviewers: t.p.northover, chapuni Reviewed By: t.p.northover CC: llvm-commits, alex, chapuni Differential Revision: http://llvm-reviews.chandlerc.com/D3021 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203797 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-13 13:36:25 +00:00
Karthik Bhat	178df3f1bb	Fix PR18800. llvm intrinsic memcpy takes 5 arguments void @llvm.memcpy.p0i8.p0i8.i32(i8* <dest>, i8* <src>, i32 <len>, i32 <align>, i1 <isvolatile>).The test case incorrectly uses the old format resulting in isVolatile function in MemIntrinsic to crash during SROA transformation.Modified the test case to use correct signature of memcpy and memset. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203750 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-13 04:50:29 +00:00
Rafael Espindola	eb8eef0b3f	This test need the X86 backend, move it to the X86 sub directory. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203725 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 22:03:43 +00:00
Michael Zolotukhin	4a0593ccd3	PR17473: Don't normalize an expression during postinc transformation unless it's invertible. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203719 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 21:31:05 +00:00
Raul E. Silvera	230eda4bdf	Resubmit "[SLPV] Recognize vectorizable intrinsics during SLP vectorization ..." This reverts commit 86cb795388643710dab34941ddcb5a9470ac39d8. The problems previously found have been resolved through other CLs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203707 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 20:21:50 +00:00
Rafael Espindola	38048cdb1c	Reject alias to undefined symbols in the verifier. On ELF and COFF an alias is just another name for a position in the file. There is no way to refer to a position in another file, so an alias to undefined is meaningless. MachO currently doesn't support aliases. The spec has a N_INDR, which when implemented will have a different set of restrictions. Adding support for it shouldn't be harder than any other IR extension. For now, having the IR represent what is actually possible with current tools makes it easier to fix the design of GlobalAlias. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203705 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 20:15:49 +00:00
Hans Wennborg	09a31f3154	Allow switch-to-lookup table for tables with holes by adding bitmask check This allows us to generate table lookups for code such as: unsigned test(unsigned x) { switch (x) { case 100: return 0; case 101: return 1; case 103: return 2; case 105: return 3; case 107: return 4; case 109: return 5; case 110: return 6; default: return f(x); } } Since cases 102, 104, etc. are not constants, the lookup table has holes in those positions. We therefore guard the table lookup with a bitmask check. Patch by Jasper Neumann! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203694 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 18:35:40 +00:00
Evan Cheng	9225686155	Revert r203488 and r203520. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203687 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-12 18:09:37 +00:00
Erik Verbruggen	0638609d66	Fix crash in PRE. After r203553 overflow intrinsics and their non-intrinsic (normal) instruction get hashed to the same value. This patch prevents PRE from moving an instruction into a predecessor block, and trying to add a phi node that gets two different types (the intrinsic result and the non-intrinsic result), resulting in a failing assert. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203574 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-11 15:07:32 +00:00
Tim Northover	ca396e391e	IR: add a second ordering operand to cmpxhg for failure The syntax for "cmpxchg" should now look something like: cmpxchg i32* %addr, i32 42, i32 3 acquire monotonic where the second ordering argument gives the required semantics in the case that no exchange takes place. It should be no stronger than the first ordering constraint and cannot be either "release" or "acq_rel" (since no store will have taken place). rdar://problem/15996804 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203559 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-11 10:48:52 +00:00
Erik Verbruggen	cf5240642b	GVN: merge overflow intrinsics with non-overflow instructions. When an overflow intrinsic is followed by a non-overflow instruction, replace the latter with an extract. For example: %sadd = tail call { i32, i1 } @llvm.sadd.with.overflow.i32(i32 %a, i32 %b) %sadd3 = add i32 %a, %b Here the add statement will be replaced by an extract. When an overflow intrinsic follows a non-overflow instruction, a clone of the intrinsic is inserted before the normal instruction, which makes it the same as the previous case. Subsequent runs of GVN can then clean up the duplicate instructions and insert the extract. This fixes PR8817. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203553 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-11 09:36:48 +00:00
Diego Novillo	87393cfd6b	Use discriminator information in sample profiles. Summary: When the sample profiles include discriminator information, use the discriminator values to distinguish instruction weights in different basic blocks. This modifies the BodySamples mapping to map <line, discriminator> pairs to weights. Instructions on the same line but different blocks, will use different discriminator values. This, in turn, means that the blocks may have different weights. Other changes in this patch: - Add tests for positive values of line offset, discriminator and samples. - Change data types from uint32_t to unsigned and int and do additional validation. Reviewers: chandlerc CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D2857 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203508 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-10 22:41:28 +00:00
Benjamin Kramer	8da0b7358d	MemCpyOpt: When merging memsets also merge the trivial case of two memsets with the same destination. The testcase is from PR19092, but I think the bug described there is actually a clang issue. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203489 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-10 21:05:13 +00:00
Evan Cheng	d89b0f200c	For functions with ARM target specific calling convention, when simplify-libcall optimize a call to a llvm intrinsic to something that invovles a call to a C library call, make sure it sets the right calling convention on the call. e.g. extern double pow(double, double); double t(double x) { return pow(10, x); } Compiles to something like this for AAPCS-VFP: define arm_aapcs_vfpcc double @t(double %x) #0 { entry: %0 = call double @llvm.pow.f64(double 1.000000e+01, double %x) ret double %0 } declare double @llvm.pow.f64(double, double) #1 Simplify libcall (part of instcombine) will turn the above into: define arm_aapcs_vfpcc double @t(double %x) #0 { entry: %__exp10 = call double @__exp10(double %x) #1 ret double %__exp10 } declare double @__exp10(double) The pre-instcombine code works because calls to LLVM builtins are special. Instruction selection will chose the right calling convention for the call. However, the code after instcombine is wrong. The call to __exp10 will use the C calling convention. I can think of 3 options to fix this. 1. Make "C" calling convention just work since the target should know what CC is being used. This doesn't work because each function can use different CC with the "pcs" attribute. 2. Have Clang add the right CC keyword on the calls to LLVM builtin. This will work but it doesn't match the LLVM IR specification which states these are "Standard C Library Intrinsics". 3. Fix simplify libcall so the resulting calls to the C routines will have the proper CC keyword. e.g. %__exp10 = call arm_aapcs_vfpcc double @__exp10(double %x) #1 This works and is the solution I implemented here. Both solutions #2 and #3 would work. After carefully considering the pros and cons, I decided to implement #3 for the following reasons. 1. It doesn't change the "spec" of the intrinsics. 2. It's a self-contained fix. There are a couple of potential downsides. 1. There could be other places in the optimizer that is broken in the same way that's not addressed by this. 2. There could be other calling conventions that need to be propagated by simplify-libcall that's not handled. But for now, this is the fix that I'm most comfortable with. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203488 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-10 20:49:45 +00:00
NAKAMURA Takumi	e086782817	Revert r203230, "CodeGenPrep: sink extends of illegal types into use block." It choked i686 stage2. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203386 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-09 11:01:07 +00:00
David Majnemer	39a09d2b7c	IR: Change inalloca's grammar a bit The grammar for LLVM IR is not well specified in any document but seems to obey the following rules: - Attributes which have parenthesized arguments are never preceded by commas. This form of attribute is the only one which ever has optional arguments. However, not all of these attributes support optional arguments: 'thread_local' supports an optional argument but 'addrspace' does not. Interestingly, 'addrspace' is documented as being a "qualifier". What constitutes a qualifier? I cannot find a definition. - Some attributes use a space between the keyword and the value. Examples of this form are 'align' and 'section'. These are always preceded by a comma. - Otherwise, the attribute has no argument. These attributes do not have a preceding comma. Sometimes an attribute goes before the instruction, between the instruction and it's type, or after it's type. 'atomicrmw' has 'volatile' between the instruction and the type while 'call' has 'tail' preceding the instruction. With all this in mind, it seems most consistent for 'inalloca' on an 'inalloca' instruction to occur before between the instruction and the type. Unlike the current formulation, there would be no preceding comma. The combination 'alloca inalloca' doesn't look particularly appetizing, perhaps a better spelling of 'inalloca' is down the road. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203376 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-09 06:41:58 +00:00
Tim Northover	fa9e4b52f4	CodeGenPrep: sink extends of illegal types into use block. This helps the instruction selector to lower an i64 * i64 -> i128 multiplication into a single instruction on targets which support it. Patch by Manuel Jacob. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203230 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-07 11:04:30 +00:00
Tim Northover	69d2b2aa5a	InstCombine: form shuffles from wider range of insert/extractelements Sequences of insertelement/extractelements are sometimes used to build vectorsr; this code tries to put them back together into shuffles, but could only produce a completely uniform shuffle types (<N x T> from two <N x T> sources). This should allow shuffles with different numbers of elements on the input and output sides as well. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203229 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-07 10:24:44 +00:00
Karthik Bhat	70957b9c55	Allow constant folding of round function whenever feasible git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203198 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-07 04:36:21 +00:00
Karthik Bhat	df95a94064	Allow constant folding of copysign git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203076 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-06 05:32:52 +00:00
Raul E. Silvera	8b6d60f94b	Change math intrinsic attributes from readonly to readnone. These are operations that do not access memory but may be sensitive to floating-point environment changes. LLVM does not attempt to model FP environment changes, so this was unnecessarily conservative and was getting on the way of some optimizations, in particular SLP vectorization. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203037 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-06 00:18:15 +00:00
Arnold Schwaighofer	9d84b4d70c	LoopVectorizer: Preserve fast-math flags Fixes PR19045. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203008 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-05 21:10:47 +00:00
Benjamin Kramer	4d36f91c08	ConstantFolding: Also fold the vector overloads of our math intrinsics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202997 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-05 19:41:48 +00:00
Raul E. Silvera	7e8a5404fc	Trivial test commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202924 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-05 02:09:51 +00:00
Matt Arsenault	34fae3adef	Allow constant folding of fma and fmuladd git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202914 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-05 00:02:00 +00:00
Diego Novillo	f05b45fdb2	Pass to emit DWARF path discriminators. DWARF discriminators are used to distinguish multiple control flow paths on the same source location. When this happens, instructions across basic block boundaries will share the same debug location. This pass detects this situation and creates a new lexical scope to one of the two instructions. This lexical scope is a child scope of the original and contains a new discriminator value. This discriminator is then picked up from MCObjectStreamer::EmitDwarfLocDirective to be written on the object file. This fixes http://llvm.org/bugs/show_bug.cgi?id=18270. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202752 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-03 20:06:11 +00:00
Eric Christopher	0181303087	Add a debug info code generation level to the compile unit metadata and update everything accordingly. This can be used to conditionalize the amount of output in the backend based on the amount of debug requested/metadata emission scheme by a front end (e.g. clang). Paired with a commit to clang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202332 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-27 01:24:56 +00:00
Nico Rieck	06eab82006	Fix broken FileCheck prefixes git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202308 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-26 22:29:11 +00:00
Reid Kleckner	726bae9a66	GlobalOpt: Apply fastcc to internal x86_thiscallcc functions We should apply fastcc whenever profitable. We can expand this list, but there are lots of conventions with performance implications that we don't want to change. Differential Revision: http://llvm-reviews.chandlerc.com/D2705 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202293 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-26 19:57:30 +00:00
Nico Rieck	5732fbd6e4	Fix broken FileCheck prefix git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202291 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-26 19:51:08 +00:00
Andrew Trick	401d35bedb	Fix PR18165: LSR must avoid scaling factors that exceed the limit on truncated use. Patch by Michael Zolotukhin! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202273 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-26 16:31:56 +00:00
Chandler Carruth	2b442bffcb	[SROA] Use the correct index integer size in GEPs through non-default address spaces. This isn't really a correctness issue (the values are truncated) but its much cleaner. Patch by Matt Arsenault! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202252 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-26 10:08:16 +00:00
Chandler Carruth	5b95cec37c	[SROA] Teach SROA how to handle pointers from address spaces other than the default. Based on the patch by Matt Arsenault, D1764! I switched one place to use the more direct pointer type to compute the desired address space, and I reworked the memcpy rewriting section to reflect significant refactorings that this patch helped inspire. Thanks to several of the folks who helped review and improve the patch as well. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202247 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-26 08:25:02 +00:00
Chandler Carruth	38e90e3de1	[SROA] Split the alignment computation complete for the memcpy rewriting to work independently for the slice side and the other side. This allows us to only compute the minimum of the two when we actually rewrite to a memcpy that needs to take the minimum, and preserve higher alignment for one side or the other when rewriting to loads and stores. This fix was inspired by seeing the result of some refactoring that makes addrspace handling better. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202242 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-26 07:29:54 +00:00
Chandler Carruth	50bc165c54	[SROA] Fix PR18615 with some long overdue simplifications to the bounds checking in SROA. The primary change is to just rely on uge for checking that the offset is within the allocation size. This removes the explicit checks against isNegative which were terribly error prone (including the reversed logic that led to PR18615) and prevented us from supporting stack allocations larger than half the address space.... Ok, so maybe the latter isn't common but it's a silly restriction to have. Also, we used to try to support a PHI node which loaded from before the start of the allocation if any of the loaded bytes were within the allocation. This doesn't make any sense, we have never really supported loading or storing before the allocation starts. The simplified logic just doesn't care. We continue to allow loading past the end of the allocation in part to support cases where there is a PHI and some loads are larger than others and the larger ones reach past the end of the allocation. We could solve this a different and more conservative way, but I'm still somewhat paranoid about this. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202224 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-26 03:14:14 +00:00
Chandler Carruth	c537759a7f	[SROA] Fix another instability in SROA with respect to the slice ordering. The fundamental problem that we're hitting here is that the use-def chain ordering is itself not a stable thing to be relying on in the rewriting for SROA. Further, we use a non-stable sort over the slices to arrange them based on the section of the alloca they're operating on. With a debugging STL implementation (or different implementations in stage2 and stage3) this can cause stage2 != stage3. The specific aspect of this problem fixed in this commit deals with the rewriting and load-speculation around PHIs and Selects. This, like many other aspects of the use-rewriting in SROA, is really part of the "strong SSA-formation" that is doen by SROA where it works very hard to canonicalize loads and stores in just the right way to satisfy the needs of mem2reg[1]. When we have a select (or a PHI) with 2 uses of the same alloca, we test that loads downstream of the select are speculatable around it twice. If only one of the operands to the select needs to be rewritten, then if we get lucky we rewrite that one first and the select is immediately speculatable. This can cause the order of operand visitation, and thus the order of slices to be rewritten, to change an alloca from promotable to non-promotable and vice versa. The fix is to defer all of the speculation until after the rewrite phase is done. Once we've rewritten everything, we can accurately test for whether speculation will work (once, instead of twice!) and the order ceases to matter. This also happens to simplify the other subtlety of speculation -- we need to not speculate anything unless the result of speculating will make the alloca fully promotable by mem2reg. I had a previous attempt at simplifying this, but it was still pretty horrible. There is actually already a really nice test case for this in basictest.ll, but on multiple STL implementations and inputs, we just got "lucky". Fortunately, the test case is very small and we can essentially build it in exactly the opposite way to get reasonable coverage in both directions even from normal STL implementations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202092 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-25 00:07:09 +00:00
Arnold Schwaighofer	68085c7bef	SLPVectorizer: Try vectorizing 'splat' stores Vectorize sequential stores of a broadcasted value. 5% on eon. radar://16124699 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@202067 91177308-0d34-0410-b5e6-96231b3b80d8	2014-02-24 19:52:29 +00:00

1 2 3 4 5 ...

5538 Commits