llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2025-02-25 19:29:53 +00:00

Author	SHA1	Message	Date
Nick Lewycky	b69f873ee1	Fix typo in comment from r218733 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218739 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-01 03:37:34 +00:00
Chandler Carruth	9e2fe46484	[x86] Teach the new vector shuffle lowering to be even more aggressive in exposing the scalar value to the broadcast DAG fragment so that we can catch even reloads and fold them into the broadcast. This is somewhat magical I'm afraid but seems to work. It is also what the old lowering did, and I've switched an old test to run both lowerings demonstrating that we get the same result. Unlike the old code, I'm not lowering f32 or f64 scalars through this path when we only have AVX1. The target patterns include pretty heinous code to re-cast those as shuffles when the scalar happens to not be spilled because AVX1 provides no broadcast mechanism from registers what-so-ever. This is terribly brittle. I'd much rather go through our generic lowering code to get this. If needed, we can add a peephole to get even more opportunities to broadcast-from-spill-slots that are exposed post-RA, but my suspicion is this just doesn't matter that much. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218734 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-01 03:19:43 +00:00
Chandler Carruth	429670f0e8	[x86] Hoist the zext-lowering up in the v4i32 lowering routine -- it is the same speed as pshufd but we can fold loads into the pmovzx instructions. This fixes some regressions that came up in the regression test suite for the new vector shuffle lowering. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218733 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-01 02:25:54 +00:00
Adam Nemet	d0d5b08fbd	[AVX512] Remove space before \t in AsmStrings. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218725 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-01 00:41:32 +00:00
Chandler Carruth	afe75172b1	[x86] Teach the new vector shuffle lowering about VBROADCAST and VPBROADCAST. This has the somewhat expected pervasive impact. I don't know why I forgot about this. Everything seems good with lots of significant improvements in the tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218724 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-01 00:41:21 +00:00
Sanjay Patel	cafc85bf1e	Split the estimate() interface into separate functions for each type. NFC. It was hacky to use an opcode as a switch because it won't always match (rsqrte != sqrte), and it looks like we'll need to add more special casing per arch than I had hoped for. Eg, x86 will prefer a different NR estimate implementation. ARM will want to use it's 'step' instructions. There also don't appear to be any new estimate instructions in any arch in a long, long time. Altivec vloge and vexpte may have been the first and last in that field... git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218698 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-30 20:28:48 +00:00
Juergen Ributzka	9952c922c2	Recommit r218010 [FastISel][AArch64] Fold bit test and branch into TBZ and TBNZ. Note: This version fixed an issue with the TBZ/TBNZ instructions that were generated in FastISel. The issue was that the 64bit version of TBZ (TBZX) automagically sets the upper bit of the immediate field that is used to specify the bit we want to test. To test for any of the lower 32bits we have to first extract the subregister and use the 32bit version of the TBZ instruction (TBZW). Original commit message: Teach selectBranch to fold bit test and branch into a single instruction (TBZ or TBNZ). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218693 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-30 19:59:35 +00:00
Matt Arsenault	28233d3a63	R600/SI: Fix printing of clamp and omod No tests for omod since nothing uses it yet, but this should get rid of the remaining annoying trailing zeros after some instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218692 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-30 19:49:48 +00:00
Matt Arsenault	532a5c7dc6	R600/SI: Update VOP3b to not include obsolete operands abs / neg are now part of the srcN_modifiers operands git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218691 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-30 19:49:43 +00:00
Reed Kotler	8a6f79e58d	Add numeric extend, trunctate to mips fast-isel Summary: Add numeric extend, trunctate to mips fast-isel Reactivates D4827 Test Plan: fpext.ll loadstoreconv.ll Reviewers: dsanders Subscribers: mcrosier Differential Revision: http://reviews.llvm.org/D5251 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218681 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-30 16:30:13 +00:00
Tom Coxon	8a23890385	[AArch64] Remove unnecessary whitespace. (Test commit) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218680 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-30 16:23:16 +00:00
Robert Khasanov	8acdc5232d	[AVX512] Added intrinsics for 128-, 256- and 512-bit versions of VCMPGT{BWDQ}. Patch by Sergey Lisitsyn <sergey.lisitsyn@intel.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218670 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-30 12:15:52 +00:00
Robert Khasanov	175ff01f0f	[AVX512] Added intrinsics for 128- and 256-bit versions of VCMPEQ{BWDQ} Fixed lowering of this intrinsics in case when mask is v2i1 and v4i1. Now cmp intrinsics lower in the following way: (i8 (int_x86_avx512_mask_pcmpeq_q_128 (v2i64 %a), (v2i64 %b), (i8 %mask))) -> (i8 (bitcast (v8i1 (insert_subvector undef, (v2i1 (and (PCMPEQM %a, %b), (extract_subvector (v8i1 (bitcast %mask)), 0))), 0)))) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218669 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-30 11:41:54 +00:00
Robert Khasanov	cfa5724d50	[AVX512] Added intrinsics for VPCMPEQB and VPCMPEQW. Added new operand type for intrinsics (IIT_V64) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218668 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-30 11:32:22 +00:00
Robert Khasanov	58da66b2bf	[AVX512] Enabled intrinsics for VPCMPEQD and VPCMPEQQ. Added CMP_MASK intrinsic type git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218667 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-30 11:19:50 +00:00
Job Noorman	deb16c9eac	Make sure aggregates are properly alligned on MSP430. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218665 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-30 11:15:44 +00:00
Chandler Carruth	4abb04a65c	[x86] Revert r218588, r218589, and r218600. These patches were pursuing a flawed direction and causing miscompiles. Read on for details. Fundamentally, the premise of this patch series was to map VECTOR_SHUFFLE DAG nodes into VSELECT DAG nodes for all blends because we are going to have to lower to VSELECT nodes for some blends to trigger the instruction selection patterns of variable blend instructions. This doesn't actually work out so well. In order to match performance with the existing VECTOR_SHUFFLE lowering code, we would need to re-slice the blend in order to fit it into either the integer or floating point blends available on the ISA. When coming from VECTOR_SHUFFLE (or other vNi1 style VSELECT sources) this works well because the X86 backend ensures that these types of operands to VSELECT get sign extended into '-1' and '0' for true and false, allowing us to re-slice the bits in whatever granularity without changing semantics. However, if the VSELECT condition comes from some other source, for example code lowering vector comparisons, it will likely only have the required bit set -- the high bit. We can't blindly slice up this style of VSELECT. Reid found some code using Halide that triggers this and I'm hopeful to eventually get a test case, but I don't need it to understand why this is A Bad Idea. There is another aspect that makes this approach flawed. When in VECTOR_SHUFFLE form, we have very distilled information that represents the constant blend mask. Converting back to a VSELECT form actually can lose this information, and so I think now that it is better to treat this as VECTOR_SHUFFLE until the very last moment and only use VSELECT nodes for instruction selection purposes. My plan is to: 1) Clean up and formalize the target pre-legalization DAG combine that converts a VSELECT with a constant condition operand into a VECTOR_SHUFFLE. 2) Remove any fancy lowering from VSELECT during legalization relying entirely on the DAG combine to catch cases where we can match to an immediate-controlled blend instruction. One additional step that I'm not planning on but would be interested in others' opinions on: we could add an X86ISD::VSELECT or X86ISD::BLENDV which encodes a fully legalized VSELECT node. Then it would be easy to write isel patterns only in terms of this to ensure VECTOR_SHUFFLE legalization only ever forms the fully legalized construct and we can't cycle between it and VSELECT combining. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218658 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-30 02:52:28 +00:00
Matt Arsenault	108eb07f61	Fix missing C++ mode comment git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218654 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-30 01:05:27 +00:00
Juergen Ributzka	a0af4b0271	[FastISel][AArch64] Fold sign-/zero-extends into the load instruction. The sign-/zero-extension of the loaded value can be performed by the memory instruction for free. If the result of the load has only one use and the use is a sign-/zero-extend, then we emit the proper load instruction. The extend is only a register copy and will be optimized away later on. Other instructions that consume the sign-/zero-extended value are also made aware of this fact, so they don't fold the extend too. This fixes rdar://problem/18495928. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218653 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-30 00:49:58 +00:00
Juergen Ributzka	e8a9706eda	[FastISel][AArch64] Factor out scale factor calculation. NFC. Factor out the code that determines the implicit scale factor of memory operations for a given value type. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218652 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-30 00:49:54 +00:00
Eric Christopher	11bea1bff3	Simplify conditional. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218643 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-29 23:31:13 +00:00
Adam Nemet	e3d2fcce41	[AVX512] Use X86VectorVTInfo in the masking helper classes and the FMAs No functionality change. Makes the code more compact (see the FMA part). This needs a new type attribute MemOpFrag in X86VectorVTInfo. For now I only defined this in the simple cases. See the commment before the attribute. Diff of X86.td.expanded before and after is empty except for the appearance of the new attribute. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218637 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-29 22:54:41 +00:00
Eric Christopher	6a2169eb6f	Add soft-float to the key for the subtarget lookup in the TargetMachine map, this makes sure that we can compile the same code for two different ABIs (hard and soft float) in the same module. Update one testcase accordingly (and fix some confusing naming) and add a new testcase as well with the ordering swapped which would highlight the problem. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218632 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-29 21:57:54 +00:00
Eric Christopher	200f3764bf	Fix spelling and reflow comments. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218631 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-29 21:57:52 +00:00
Dave Estes	f657899174	[AArch64] Refines the Cortex-A57 Machine Model Primarily refines all of the instructions with accurate latency and micro-op information. Refinements largely focus on the NEON instructions. Additionally, a few advanced features are modeled, including forwarding for MAC instructions and hazards for floating point SQRT and DIV. Lastly, the issue-width is reduced to three so that the scheduler will better accommodate the narrower decode and dispatch width. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218627 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-29 21:27:36 +00:00
Matt Arsenault	0db97bb9b4	Fix include order git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218611 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-29 15:53:15 +00:00
Matt Arsenault	1bcadc9b5c	R600/SI: Fix hardcoded values for modifiers. Move enums to SIDefines.h git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218610 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-29 15:50:26 +00:00
Matt Arsenault	49cbc1891b	R600/SI: Also fix fsub + fadd a, a to mad combines git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218609 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-29 14:59:38 +00:00
Matt Arsenault	a5f45d5444	R600/SI: Fix using mad with multiplies by 2 These turn into fadds, so combine them into the target mad node. fadd (fadd (a, a), b) -> mad 2.0, a, b git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218608 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-29 14:59:34 +00:00
Chad Rosier	ea64dce261	[AArch64] Improve cost model to handle sdiv by a pow-of-two. This patch improves the target-specific cost model to better handle signed division by a power of two. The immediate result is that this enables the SLP vectorizer to do a better job. http://reviews.llvm.org/D5469 PR20714 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218607 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-29 13:59:31 +00:00
Oliver Stannard	017c6111a8	[Thumb2] ldrexd and strexd are not defined on v7M The Thumb2 ldrexd and strexd instructions are not defined for M-class architectures. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218603 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-29 10:57:29 +00:00
Chandler Carruth	8ac2f142a8	[x86] Make the new vector shuffle lowering lower blends as VSELECT nodes, and rely exclusively on its logic. This removes a ton of duplication from the blend lowering and centralizes it in one place. One downside is that it requires a bunch of hacks to make this work with the current legalization framework. We have to manually speculate one aspect of legalizing VSELECT nodes to get everything to work nicely because the existing legalization framework isn't actually bottom-up. The other grossness is that we somewhat duplicate the analysis of constant blends. I'm on the fence here. If reviewers thing this would look better with VSELECT when it has constant operands dumping over tho VECTOR_SHUFFLE, we could go that way. But it would be a substantial change because currently all of the actual blend instructions are matched via patterns in the TD files based around VSELECT nodes (despite them not being perfect fits for that). Suggestions welcome, but at least this removes the rampant duplication in the backend. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218600 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-29 09:57:07 +00:00
Chandler Carruth	d23f1883d3	[x86] Delete a bunch of really bad and totally unnecessary code in the X86 target-specific DAG combining that tried to convert VSELECT nodes into VECTOR_SHUFFLE nodes that it "knew" would lower into immediate-controlled blend nodes. Turns out, we have perfectly good lowering of all these VSELECT nodes, and indeed that lowering already knows how to handle lowering through BLENDI to immediate-controlled blend nodes. The code just wasn't getting used much because this thing forced the world to go through the vector shuffle lowering. Yuck. This also exposes that I was too aggressive in avoiding domain crossing in v218588 with that lowering -- when the other option is to expand into two 128-bit vectors, it is worth domain crossing. Restore that behavior now that we have nice tests covering it. The test updates here fall into two camps. One is where previously we ended up with an unsigned encoding of the blend operand and now we get a signed encoding. In most of those places there were elaborate comments explaining exactly what these operands really mean. Rather than that, just switch these tests to use the nicely decoded comments that make it obvious that the final shuffle matches. The other updates are just removing pointless domain crossing by blending integers with PBLENDW rather than BLENDPS. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218589 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-29 02:01:20 +00:00
Chandler Carruth	3589550b3e	[x86] Refactor all of the VSELECT-as-blend lowering code to avoid domain crossing and generally work more like the blend emission code in the new vector shuffle lowering. My goal is to have the new vector shuffle lowering just produce VSELECT nodes that are either matched here to BLENDI or are legal and matched in the .td files to specific blend instructions. That seems much cleaner as there are other ways to produce a VSELECT anyways. =] No observable functionality changed yet, mostly because this code appears to be near-dead. The behavior of this lowering routine did change though. This code being mostly dead and untestable will change with my next commit which will also point some new tests at it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218588 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-29 01:32:54 +00:00
Chandler Carruth	b3cf6a65d6	[x86] Improve naming and comments for VSELECT lowering. No functionality changed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218586 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-29 00:51:58 +00:00
Chandler Carruth	8e93ce1780	[x86] Add the dispatch skeleton to the new vector shuffle lowering for AVX-512. There is no interesting logic yet. Everything ends up eventually delegating to the generic code to split the vector and shuffle the halves. Interestingly, that logic does a significantly better job of lowering all of these types than the generic vector expansion code does. Mostly, it lets most of the cases fall back to nice AVX2 code rather than all the way back to SSE code paths. Step 2 of basic AVX-512 support in the new vector shuffle lowering. Next up will be to incrementally add direct support for the basic instruction set to each type (adding tests first). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218585 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-29 00:37:27 +00:00
Chandler Carruth	3bc1ba672c	[x86] Make the split-and-lower routine fully generic by relaxing the assertion, making the name generic, and improving the documentation. Step 1 in adding very primitive support for AVX-512. No functionality changed yet. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218584 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-29 00:21:49 +00:00
Chandler Carruth	b61dfec824	[x86] Teach the new vector shuffle lowering to fall back on AVX-512 vectors. Someone will need to build the AVX512 lowering, which should follow AVX1 and AVX2 very closely for AVX512F and AVX512BW resp. I've added a dummy test which is a port of the v8f32 and v8i32 tests from AVX and AVX2 to v8f64 and v8i64 tests for AVX512F and AVX512BW. Hopefully this is enough information for someone to implement proper lowering here. If not, I'll be happy to help, but right now the AVX-512 support isn't a priority for me. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218583 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-28 23:53:10 +00:00
Chandler Carruth	4f4280469c	[x86] Fix the new vector shuffle lowering's use of VSELECT for AVX2 lowerings. This was hopelessly broken. First, the x86 backend wants '-1' to be the element value representing true in a boolean vector, and second the operand order for VSELECT is backwards from the actual x86 instructions. To make matters worse, the backend is just using '-1' as the true value to get the high bit to be set. It doesn't actually symbolically map the '-1' to anything. But on x86 this isn't quite how it works: there only the high bit is relevant. As a consequence weird non-'-1' values like 0x80 actually "work" once you flip the operands to be backwards. Anyways, thanks to Hal for helping me sort out what these should be. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218582 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-28 23:23:55 +00:00
Chandler Carruth	3f40848670	[x86] Fix a really silly bug that I introduced fixing another bug in the new vector shuffle target DAG combines -- it helps to actually test for the value you want rather than just using an integer in a boolean context. Have I mentioned that I loathe implicit conversions recently? :: sigh :: git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218576 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-28 06:11:04 +00:00
Chandler Carruth	21b69296fb	[x86] Fix yet another bug in the new vector shuffle lowering's handling of widening masks. We can't widen a zeroing mask unless both elements that would be merged are either zeroed or undef. This is the only way to widen a mask if it has a zeroed element. Also clean up the code here by ordering the checks in a more logical way and by using the symoblic values for undef and zero. I'm actually torn on using the symbolic values because the existing code is littered with the assumption that -1 is undef, and moreover that entries '< 0' are the special entries. While that works with the values given to these constants, using the symbolic constants actually makes it a bit more opaque why this is the case. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218575 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-28 03:30:25 +00:00
Chandler Carruth	b66b0cf2eb	[x86] Fix yet another issue with widening vector shuffle elements. I spotted this by inspection when debugging something else, so I have no test case what-so-ever, and am not even sure it is possible to realistically trigger the bug. But this is what was intended here. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218565 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-27 08:40:33 +00:00
Chandler Carruth	72c3b07dfd	[x86] Fix terrible bugs everywhere in the new vector shuffle lowering and in the target shuffle combining when trying to widen vector elements. Previously only one of these was correct, and we didn't correctly propagate zeroing target shuffle masks (which have a different sentinel value from undef in non- target shuffle masks now). This isn't just a missed optimization, this caused us to drop zeroing shuffles on the floor and miscompile code. The added test case is one example of that. There are other fixes to the test suite as a consequence of this as well as restoring the undef elements in some of the masks that were lost when I brought sanity to the actual value of the undef and zero sentinels. I've also just cleaned up some of the PSHUFD and PSHUFLW and PSHUFHW combining code, but that code really needs to go. It was a nice initial attempt, but it isn't very principled and the recursive shuffle combiner is much more powerful. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218562 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-27 04:42:44 +00:00
Chandler Carruth	8470b5b812	[x86] Flip the sentinel values used in the target shuffle mask decoding to significantly more sane sentinels. Notably, everywhere else in the backend's representation of shuffles uses '-1' to represent undef. The target shuffle masks really shouldn't diverge from that, especially as in a few places they are manipulated by shared code. This causes us to lose some undef lanes in various test masks. I want to get these back, but technically it isn't invalid and there are a lot of bugs here so I want to try to establish a saner baseline for fixing some of the bugs by aligning the specific senitnel values used. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218561 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-27 04:42:39 +00:00
Sanjay Patel	676af35b38	Refactor reciprocal and reciprocal square root estimate into target-independent functions (part 2). This is purely refactoring. No functional changes intended. PowerPC is the only target that is currently using this interface. The ultimate goal is to allow targets other than PowerPC (certainly X86 and Aarch64) to turn this: z = y / sqrt(x) into: z = y * rsqrte(x) And: z = y / x into: z = y * rcpe(x) using whatever HW magic they can use. See http://llvm.org/bugs/show_bug.cgi?id=20900 . There is one hook in TargetLowering to get the target-specific opcode for an estimate instruction along with the number of refinement steps needed to make the estimate usable. Differential Revision: http://reviews.llvm.org/D5484 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218553 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-26 23:01:47 +00:00
Chandler Carruth	0a31a52b91	[x86] Fix a moderately terrifying bug in the new 128-bit shuffle logic that managed to elude all of my fuzz testing historically. =/ Something changed to allow this code path to actually be exercised and it was doing bad things. It is especially heavily exercised by the patterns that emerge when doing AVX shuffles that end up lowered through the 128-bit code path. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218540 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-26 20:41:45 +00:00
Matt Arsenault	07b7c98d61	R600/SI: Use break instead of continue If an instruction doesn't have src1, it doesn't have src2 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218536 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-26 17:55:14 +00:00
Matt Arsenault	88416c337b	R600/SI: Add a note about the order of the operands to div_scale git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218534 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-26 17:55:09 +00:00
Matt Arsenault	508b8db287	R600/SI: Move finding SGPR operand to move to separate function git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218533 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-26 17:55:06 +00:00
Matt Arsenault	d991d2217b	R600/SI Allow same SGPR to be used for multiple operands Instead of moving the first SGPR that is different than the first, legalize the operand that requires the fewest moves if one SGPR is used for multiple operands. This saves extra moves and is also required for some instructions which require that the same operand be used for multiple operands. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218532 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-26 17:55:03 +00:00

1 2 3 4 5 ...

30884 Commits