llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-12-15 04:30:12 +00:00

Author	SHA1	Message	Date
Christian Pirker	94708f1784	AArch64_BE function argument passing for ARM ABI git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204814 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-26 14:51:22 +00:00
Tim Northover	fc4fa22846	ARM: add intrinsics for the v8 ldaex/stlex We've already got versions without the barriers, so this just adds IR-level support for generating the new v8 ones. rdar://problem/16227836 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204813 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-26 14:39:31 +00:00
Matheus Almeida	637e1da9e9	[mips] Hoist common functionality into a new function. Given that we support multiple directives that enable a particular feature (e.g. '.set mips16'), it's best to hoist that code into a new function so that we don't repeat the same pattern w.r.t parsing and handling error cases. No functional changes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204811 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-26 14:26:27 +00:00
Renato Golin	58839f43de	Change @llvm.clear_cache default to call rt-lib After some discussion on IRC, emitting a call to the library function seems like a better default, since it will move from a compiler internal error to a linker error, that the user can work around until LLVM is fixed. I'm also adding a note on the responsibility of the user to confirm that the cache was cleared on platforms where nothing is done. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204806 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-26 14:01:32 +00:00
Daniel Sanders	cee1aecc57	[mips] The decision to use MO_GOT_PAGE and MO_GOT_OFST depends on the ABI being N32 or N64 not the arch being MIPS64 Summary: No functional change (in supported use cases) Reviewers: matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3177 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204805 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-26 13:59:42 +00:00
Cameron McInally	4de1039403	Fix AVX512 Gather and Scatter execution domains. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204804 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-26 13:50:50 +00:00
Matheus Almeida	0de31d52d5	[mips] Add support for '.option pic2'. The directive '.option pic2' enables PIC from assembly source. At the moment none of the macros/directives check the PIC bit but that's going to be fixed relatively soon. For example, the expansion of macros like 'la' depend on the relocation model. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204803 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-26 13:40:29 +00:00
Renato Golin	c4b058f9e7	Add @llvm.clear_cache builtin Implementing the LLVM part of the call to __builtin___clear_cache which translates into an intrinsic @llvm.clear_cache and is lowered by each target, either to a call to __clear_cache or nothing at all incase the caches are unified. Updating LangRef and adding some tests for the implemented architectures. Other archs will have to implement the method in case this builtin has to be compiled for it, since the default behaviour is to bail unimplemented. A Clang patch is required for the builtin to be lowered into the llvm intrinsic. This will be done next. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204802 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-26 12:52:28 +00:00
Hal Finkel	159e7f4095	[PowerPC] Lower VSELECT using xxsel when VSX is available With VSX there is a real vector select instruction, and so we should use it. Note that VSELECT will still scalarize for v2f64 because the corresponding SetCC result type (v2i64) is not currently a legal type. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204801 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-26 12:49:28 +00:00
Daniel Sanders	968ea7b82c	[mips] The register names depend on the ABI being N32/N64 rather than the arch being mips64 Summary: Added test cases for O32 and N32 on MIPS64. Reviewers: matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3175 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204796 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-26 11:39:07 +00:00
Daniel Sanders	95f4d65d4f	[mips] $s8 is an alias for $fp in all ABI's, not just N32/N64. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204793 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-26 11:05:24 +00:00
Rafael Espindola	72db10a995	Revert "Prevent alias from pointing to weak aliases." This reverts commit r204781. I will follow up to with msan folks to see what is what they were trying to do with aliases to weak aliases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204784 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-26 06:14:40 +00:00
Hal Finkel	360ee97179	[PowerPC] Generate logical vector VSX instructions These instructions are essentially the same as their Altivec counterparts, but have access to the larger VSX register file. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204782 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-26 04:55:40 +00:00
Rafael Espindola	33845aa8c4	Prevent alias from pointing to weak aliases. Aliases are just another name for a position in a file. As such, the regular symbol resolutions are not applied. For example, given define void @my_func() { ret void } @my_alias = alias weak void ()* @my_func @my_alias2 = alias void ()* @my_alias We produce without this patch: .weak my_alias my_alias = my_func .globl my_alias2 my_alias2 = my_alias That is, in the resulting ELF file my_alias, my_func and my_alias are just 3 names pointing to offset 0 of .text. That is not the semantics of IR linking. For example, linking in a @my_alias = alias void ()* @other_func would require the strong my_alias to override the weak one and my_alias2 would end up pointing to other_func. There is no way to represent that with aliases being just another name, so the best solution seems to be to just disallow it, converting a miscompile into an error. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204781 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-26 04:48:47 +00:00
Quentin Colombet	596516bef8	[X86] Add broadcast instructions to the table used by ExeDepsFix pass. Adds the different broadcast instructions to the ReplaceableInstrsAVX2 table. That way the ExeDepsFix pass can take better decisions when AVX2 broadcasts are across domain (int <-> float). In particular, prior to this patch we were generating: vpbroadcastd LCPI1_0(%rip), %ymm2 vpand %ymm2, %ymm0, %ymm0 vmaxps %ymm1, %ymm0, %ymm0 ## <- domain change penalty Now, we generate the following nice sequence where everything is in the float domain: vbroadcastss LCPI1_0(%rip), %ymm2 vandps %ymm2, %ymm0, %ymm0 vmaxps %ymm1, %ymm0, %ymm0 <rdar://problem/16354675> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204770 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-26 00:10:22 +00:00
Hal Finkel	6a0f060f64	[PowerPC] Select between VSX A-type and M-type FMA instructions just before RA The VSX instruction set has two types of FMA instructions: A-type (where the addend is taken from the output register) and M-type (where one of the product operands is taken from the output register). This adds a small pass that runs just after MI scheduling (and, thus, just before register allocation) that mutates A-type instructions (that are created during isel) into M-type instructions when: 1. This will eliminate an otherwise-necessary copy of the addend 2. One of the product operands is killed by the instruction The "right" moment to make this decision is in between scheduling and register allocation, because only there do we know whether or not one of the product operands is killed by any particular instruction. Unfortunately, this also makes the implementation somewhat complicated, because the MIs are not in SSA form and we need to preserve the LiveIntervals analysis. As a simple example, if we have: %vreg5<def> = COPY %vreg9; VSLRC:%vreg5,%vreg9 %vreg5<def,tied1> = XSMADDADP %vreg5<tied0>, %vreg17, %vreg16, %RM<imp-use>; VSLRC:%vreg5,%vreg17,%vreg16 ... %vreg9<def,tied1> = XSMADDADP %vreg9<tied0>, %vreg17, %vreg19, %RM<imp-use>; VSLRC:%vreg9,%vreg17,%vreg19 ... We can eliminate the copy by changing from the A-type to the M-type instruction. This means: %vreg5<def,tied1> = XSMADDADP %vreg5<tied0>, %vreg17, %vreg16, %RM<imp-use>; VSLRC:%vreg5,%vreg17,%vreg16 is replaced by: %vreg16<def,tied1> = XSMADDMDP %vreg16<tied0>, %vreg18, %vreg9, %RM<imp-use>; VSLRC:%vreg16,%vreg18,%vreg9 and we remove: %vreg5<def> = COPY %vreg9; VSLRC:%vreg5,%vreg9 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204768 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-25 23:29:21 +00:00
Hal Finkel	7e77dabbd0	[PowerPC] Correct commutable indices for VSX FMA instructions Although the first two operands are the ones that can be swapped, the tied input operand is listed before them, so we need to adjust for that. I have a test case for this, but it goes along with an upcoming commit (so it will come soon). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204748 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-25 19:26:43 +00:00
Hal Finkel	fba0a057a2	[PowerPC] Add a TableGen relation for A-type and M-type VSX FMA instructions TableGen will create a lookup table for the A-type FMA instructions providing their corresponding M-form opcodes. This will be used by upcoming commits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204746 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-25 18:55:11 +00:00
Matt Arsenault	ab5382f5eb	R600: Move computeMaskedBitsForTargetNode out of AMDILISelLowering.cpp Remove handling of select_cc, since it makes no sense to be there. This now does nothing, but I'll be adding some handling of other target nodes soon. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204743 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-25 18:18:27 +00:00
Juergen Ributzka	feaa46379a	[X86TTI] Make constant base pointers for getElementPtr opaque. If getElementPtr uses a constant as base pointer, then make the constant opaque. This prevents constant folding it with the offset. The offset can usually be encoded in the load/store instruction itself and the base address doesn't have to be rematerialized several times. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204739 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-25 18:01:25 +00:00
Juergen Ributzka	e987eb12b6	[Stackmaps][X86TTI] Fix think-o in getIntImmCost calculation. The cost for the first four stackmap operands was always TCC_Free. This is only true for the first two operands. All other operands are TCC_Free if they are within 64bit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204738 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-25 18:01:23 +00:00
Adam Nemet	6f4f46cf11	[X86] Generate VPSHUFB for in-place v16i16 shuffles This used to resort to splitting the 256-bit operation into two 128-bit shuffles and then recombining the results. Fixes <rdar://problem/16167303> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204735 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-25 17:47:06 +00:00
Adam Nemet	9526911809	[X86] Factor out new helper getPSHUFB I found three implementations of this. This splits it out into a new function and uses it from the three places. My plan is to add a fourth use when lowering a vector_shuffle:v16i16. Compared the assembly output of test/CodeGen/X86 before and after. The only change is due to how the first PSHUFB was generated in LowerVECTOR_SHUFFLEv8i16. If the shuffle mask specified undef (i.e. -1), the old implementation would write -1 * 2 and -1 * 2 + 1 (254 and 255) in the control mask. Now we write 0x80. These are of course interchangeable since bit 7 decides if a constant zero is written in the result byte. The other instances of this code use 0x80 consistently. Related to <rdar://problem/16167303> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204734 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-25 17:47:03 +00:00
Daniel Sanders	3527e5fcb3	[mips] '.set at=$0' should be equivalent to '.set noat' Differential Revision: http://llvm-reviews.chandlerc.com/D3171 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204714 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-25 13:01:06 +00:00
Cameron McInally	3ec862b7ae	Fix AVX2 Gather execution domains. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204713 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-25 12:36:38 +00:00
Daniel Sanders	c7e1663c24	[mips] Correct testcase for .set at=$reg and emit the new warnings for numeric registers too. Summary: Remove the XFAIL added in my previous commit and correct the test such that it correctly tests the expansion of the assembler temporary. Also added a test to check that $at is always $1 when written by the user. Corrected the new assembler temporary warnings so that they are emitted for numeric registers too. Differential Revision: http://llvm-reviews.chandlerc.com/D3169 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204711 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-25 11:16:03 +00:00
Daniel Sanders	c141b331b9	[mips] Fix assembler temporary expansion and add associated warnings about the use of $at. Summary: The assembler temporary is normally $at ($1) but can be reassigned using '.set at=$reg'. Regardless of which register is nominated as the assembler temporary, $at remains $1 when written by the user. Adds warnings under the following conditions: * The register nominated as the assembler temporary is used by the user. * '.set noat' is in effect and $at is used by the user. Both of these only work for named registers. I have a follow up commit that makes it work for numeric registers as well. XFAIL set-at-directive.s since it incorrectly tests that $at is redefined by '.set at=$reg'. Testcases will follow in a separate commit. Patch by David Chisnall His work was sponsored by: DARPA, AFRL Differential Revision: http://llvm-reviews.chandlerc.com/D3167 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204710 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-25 10:57:07 +00:00
Kevin Enderby	4a88cd08da	Fix crashes when assembler directives are used that are not for Mach-O object files by generating an error instead. rdar://16335232 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204687 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-25 00:05:50 +00:00
Matt Arsenault	e130844e41	R600: Don't viewCFG() under DEBUG() except on failure. Having these popping up every time you use -debug is really irritating. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204664 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-24 20:29:02 +00:00
Matt Arsenault	add2e2ec8f	R600/SI: Fix extra mov from legalizing 64-bit SALU ops. Check the register class of each operand individually to avoid an extra copy to a vgpr. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204662 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-24 20:08:13 +00:00
Matt Arsenault	3a96e61469	R600/SI: Sub-optimial fix for 64-bit immediates with SALU ops. No longer asserts, but now you get moves loading legal immediates into the split 32-bit operations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204661 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-24 20:08:09 +00:00
Matt Arsenault	db1807144a	R600/SI: Fix 64-bit bit ops that require the VALU. Try to match scalar and first like the other instructions. Expand 64-bit ands to a pair of 32-bit ands since that is not available on the VALU. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204660 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-24 20:08:05 +00:00
Matt Arsenault	6c199d8212	R600: Implement isNarrowingProfitable. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204658 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-24 19:43:31 +00:00
Matt Arsenault	03cd663eb1	R600/SI: Move splitting 64-bit immediates to separate function. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204651 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-24 18:26:52 +00:00
Ulrich Weigand	47eac58333	[PowerPC] Generate little-endian object files As a first step towards real little-endian code generation, this patch changes the PowerPC MC layer to actually generate little-endian object files. This involves passing the little-endian flag through the various layers, including down to createELFObjectWriter so we actually get basic little-endian ELF objects, emitting instructions in little-endian order, and handling fixups and relocations as appropriate for little-endian. The bulk of the patch is to update most test cases in test/MC/PowerPC to verify both big- and little-endian encodings. (The only test cases not updated are those that create actual big-endian ABI code, like the TLS tests.) Note that while the object files are now little-endian, the generated code itself is not yet updated, in particular, it still does not adhere to the ELFv2 ABI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204634 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-24 18:16:09 +00:00
Quentin Colombet	4768df00c4	[X86][ISelDAG] Add missing fallback patterns for avx2 broadcast instructions. Those patterns are used when the load cannot be folded into the related broadcast during the select phase. This happens when the load gets additional uses that were not anticipated during the previous lowering phases (constant vector to constant load, then constant load reused) or when selection DAG is not able to prove that folding the load will not create a cycle in the DAG. <rdar://problem/16074331> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204631 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-24 17:54:19 +00:00
Matt Arsenault	875870fdb4	R600/SI: Fix 64-bit private loads. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204630 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-24 17:50:46 +00:00
Adam Nemet	a1b54dd1ff	[X86] Fix non-determinism in LowerVectorAllZeroTest This can be observed with the old testcase of CodeGen/X86/pr12312.ll: 47c47 < vorps %ymm0, %ymm1, %ymm0 --- > vorps %ymm1, %ymm0, %ymm0 97c97 < vorps %ymm1, %ymm0, %ymm0 --- > vorps %ymm0, %ymm1, %ymm0 The vector VecIns is populated with all the values from VecInMap. This is done while iterating VecInMap. VecInMap uses a hash of pointer values so the resulting order can vary depending on the memory layout. The fix is to populate the vector VecIns earlier as VecInMap is populated. This is done in DAG traversal order. Fixes <rdar://problem/16398806> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204623 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-24 16:52:08 +00:00
Daniel Sanders	8ce101ed10	[mips] Add error message when trying to use $at in '.set noat' mode. Summary: Patch by David Chisnall His work was sponsored by: DARPA, AFRL Differential Revision: http://llvm-reviews.chandlerc.com/D3158 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204621 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-24 16:48:01 +00:00
Eli Bendersky	2685aa8713	Removes the NVPTXSplitBBatBar pass. This pass is a historic remnant and actually causes less efficient code to be generated in some cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204620 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-24 16:36:39 +00:00
Tom Stellard	4ddee6a5da	R600/SI: Fix warning with gcc 4.8.2 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204618 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-24 16:12:34 +00:00
Tom Stellard	65b5e9b4ef	R600/SI: Promote fp64 SELECT to i64 This type promotion is replacing a Tablegen pattern and it is already covered by existing tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204617 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-24 16:07:30 +00:00
Tom Stellard	9958475129	R600: Reorganize tablegen instruction definitions Each GPU family now has its own file. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204615 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-24 16:07:25 +00:00
Will Schmidt	04c252cc93	[PPC64LE] ELFv2 ABI updates for the .opd section [PPC64LE] ELFv2 ABI updates for the .opd section The PPC64 Little Endian (PPC64LE) target supports the ELFv2 ABI, and as such, does not have a ".opd" section. This is keyed off a _CALL_ELF=2 macro check. The CALL_ELF check is not clearly documented at this time. The basis for usage in this patch is from the gcc thread here: http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01144.html > Adding comment from Uli: Looks good to me. I think the old-style JIT doesn't really work anyway for 64-bit, but at least with this patch LLVM will compile and link again on a ppc64le host ... git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204614 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-24 16:04:15 +00:00
Daniel Sanders	eaae583095	[mips] Allow dsubu to take an immediate as an alias for dsubiu. Summary: Patch by David Chisnall His work was sponsored by: DARPA, AFRL Differential Revision: http://llvm-reviews.chandlerc.com/D3155 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204611 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-24 15:38:00 +00:00
Hal Finkel	38a80e9b21	[PowerPC] Mark many instructions as commutative I'm under the impression that we used to infer the isCommutable flag from the instruction-associated pattern. Regardless, we don't seem to do this (at least by default) any more. I've gone through all of our instruction definitions, and marked as commutative all of those that should be trivial to commute (by exchanging the first two operands). There has been special code for the RL* instructions, and that's not changed. Before this change, we had the following commutative instructions: RLDIMI RLDIMIo RLWIMI RLWIMI8 RLWIMI8o RLWIMIo XSADDDP XSMULDP XVADDDP XVADDSP XVMULDP XVMULSP After: ADD4 ADD4o ADD8 ADD8o ADDC ADDC8 ADDC8o ADDCo ADDE ADDE8 ADDE8o ADDEo AND AND8 AND8o ANDo CRAND CREQV CRNAND CRNOR CROR CRXOR EQV EQV8 EQV8o EQVo FADD FADDS FADDSo FADDo FMADD FMADDS FMADDSo FMADDo FMSUB FMSUBS FMSUBSo FMSUBo FMUL FMULS FMULSo FMULo FNMADD FNMADDS FNMADDSo FNMADDo FNMSUB FNMSUBS FNMSUBSo FNMSUBo MULHD MULHDU MULHDUo MULHDo MULHW MULHWU MULHWUo MULHWo MULLD MULLDo MULLW MULLWo NAND NAND8 NAND8o NANDo NOR NOR8 NOR8o NORo OR OR8 OR8o ORo RLDIMI RLDIMIo RLWIMI RLWIMI8 RLWIMI8o RLWIMIo VADDCUW VADDFP VADDSBS VADDSHS VADDSWS VADDUBM VADDUBS VADDUHM VADDUHS VADDUWM VADDUWS VAND VAVGSB VAVGSH VAVGSW VAVGUB VAVGUH VAVGUW VMADDFP VMAXFP VMAXSB VMAXSH VMAXSW VMAXUB VMAXUH VMAXUW VMHADDSHS VMHRADDSHS VMINFP VMINSB VMINSH VMINSW VMINUB VMINUH VMINUW VMLADDUHM VMULESB VMULESH VMULEUB VMULEUH VMULOSB VMULOSH VMULOUB VMULOUH VNMSUBFP VOR VXOR XOR XOR8 XOR8o XORo XSADDDP XSMADDADP XSMAXDP XSMINDP XSMSUBADP XSMULDP XSNMADDADP XSNMSUBADP XVADDDP XVADDSP XVMADDADP XVMADDASP XVMAXDP XVMAXSP XVMINDP XVMINSP XVMSUBADP XVMSUBASP XVMULDP XVMULSP XVNMADDADP XVNMADDASP XVNMSUBADP XVNMSUBASP XXLAND XXLNOR XXLOR XXLXOR This is a by-inspection change, and I'm not sure how to write a reliable test case. I would like advice on this, however. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204609 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-24 15:07:28 +00:00
Daniel Sanders	67db74e02c	[mips] Implement shorthand add / sub forms for MIPS. Summary: - If only two registers are passed to a three-register operation, then the first argument is both source and destination register. - If a non-register is passed as the last argument, generate the immediate version of the instruction. Also mark DADD commutative and add scheduling information (to the generic scheduler), and implement DSUB. Patch by David Chisnall His work was sponsored by: DARPA, AFRL CC: theraven Differential Revision: http://llvm-reviews.chandlerc.com/D3148 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204605 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-24 14:05:39 +00:00
Justin Holewinski	b8cb709858	[NVPTX] Add isel patterns for addrspacecast git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204600 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-24 11:17:53 +00:00
Hal Finkel	2393d22ca4	[PowerPC] Don't schedule VSX copy legalization unless VSX is enabled There is no need to schedule this extra pass if it will have nothing to do. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204594 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-24 09:51:41 +00:00
Hal Finkel	72448143b5	[PowerPC] Update comment re: VSX copy-instruction selection I've done some experimentation with this, and it looks like using the lower-latency (but lower throughput) copy instruction is essentially always the right thing to do. My assumption is that, in order to be relatively sure that the higher-latency copy will increase throughput, we'd want to have it unlikely to be in-flight with its use. On the P7, the global completion table (GCT) can hold a maximum of 120 instructions, shared among all active threads (up to 4), giving 30 instructions per thread. So specifically, I'd require at least that many instructions between the copy and the use before the high-latency variant is used. Trying this, however, over the entire test suite resulted in zero cases where the high-latency form would be preferable. This may be a consequence of the fact that the scheduler views copies as free, and so they tend to end up close to their uses. For this experiment I created a function: unsigned chooseVSXCopy(MachineBasicBlock &MBB, MachineBasicBlock::iterator I, unsigned DestReg, unsigned SrcReg, unsigned StartDist = 1, unsigned Depth = 3) const; with an implementation like: if (!Depth) return PPC::XXLOR; const unsigned MaxDist = 30; unsigned Dist = StartDist; for (auto J = I, JE = MBB.end(); J != JE && Dist <= MaxDist; ++J) { if (J->isTransient() && !J->isCopy()) continue; if (J->isCall() \|\| J->isReturn() \|\| J->readsRegister(DestReg, TRI)) return PPC::XXLOR; ++Dist; } // We've exceeded the required distance for the high-latency form, use it. if (Dist > MaxDist) return PPC::XVCPSGNDP; // If this is only an exit block, use the low-latency form. if (MBB.succ_empty()) return PPC::XXLOR; // We've reached the end of the block, check the successor blocks (up to some // depth), and use the high-latency form if that is okay with all successors. for (auto J = MBB.succ_begin(), JE = MBB.succ_end(); J != JE; ++J) { if (chooseVSXCopy(*J, (J)->begin(), DestReg, SrcReg, Dist, --Depth) == PPC::XXLOR) return PPC::XXLOR; } // All of our successor blocks seem okay with the high-latency variant, so // we'll use it. return PPC::XVCPSGNDP; and then changed the copy opcode selection from: Opc = PPC::XXLOR; to: Opc = chooseVSXCopy(MBB, std::next(I), DestReg, SrcReg); In conclusion, I'm removing the FIXME from the comment, because I believe that there is, at least absent other examples, nothing to fix. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204591 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-24 09:36:36 +00:00

1 2 3 4 5 ...

28153 Commits