llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-12-28 19:31:58 +00:00

Author	SHA1	Message	Date
Adam Nemet	d290fa608f	[X86] Improve buildFromShuffleMostly for AVX For a 256-bit BUILD_VECTOR consisting mostly of shuffles of 256-bit vectors, both the BUILD_VECTOR and its operands may need to be legalized in multiple steps. Consider: (v8f32 (BUILD_VECTOR (extract_vector_elt (v8f32 %vreg0,) Constant<1>), (extract_vector_elt %vreg0, Constant<2>), (extract_vector_elt %vreg0, Constant<3>), (extract_vector_elt %vreg0, Constant<4>), (extract_vector_elt %vreg0, Constant<5>), (extract_vector_elt %vreg0, Constant<6>), (extract_vector_elt %vreg0, Constant<7>), %vreg1)) a. We can't build a 256-bit vector efficiently so, we need to split it into two 128-bit vecs and combine them with VINSERTX128. b. Operands like (extract_vector_elt (v8f32 %vreg0), Constant<7>) needs to be split into a VEXTRACTX128 and a further extract_vector_elt from the resulting 128-bit vector. c. The extract_vector_elt from b. is lowered into a shuffle to the first element and a movss. Depending on the order in which we legalize the BUILD_VECTOR and its operands[1], buildFromShuffleMostly may be faced with: (v4f32 (BUILD_VECTOR (extract_vector_elt (vector_shuffle<1,u,u,u> (extract_subvector %vreg0, Constant<4>), undef), Constant<0>), (extract_vector_elt (vector_shuffle<2,u,u,u> (extract_subvector %vreg0, Constant<4>), undef), Constant<0>), (extract_vector_elt (vector_shuffle<3,u,u,u> (extract_subvector %vreg0, Constant<4>), undef), Constant<0>), %vreg1)) In order to figure out the underlying vector and their identity we need to see through the shuffles. [1] Note that the order in which operations and their operands are legalized is only guaranteed in the first iteration of LegalizeDAG. Fixes <rdar://problem/16296956> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206634 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 19:44:16 +00:00
Duncan P. N. Exon Smith	ebb5d29473	Revert "blockfreq: Rewrite BlockFrequencyInfoImpl" (#2 ) This reverts commit r206622 and the MSVC fixup in r206626. Apparently the remotely failing tests are still failing, despite my attempt to fix the nondeterminism in r206621. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206628 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 17:56:08 +00:00
Duncan P. N. Exon Smith	54850bedf2	Reapply "blockfreq: Rewrite BlockFrequencyInfoImpl" This reverts commit r206556, effectively reapplying commit r206548 and its fixups in r206549 and r206550. In an intervening commit I've added target triples to the tests that were failing remotely [1] (but passing locally). I'm hoping the mystery is solved? I'll revert this again if the tests are still failing remotely. [1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206622 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 17:22:25 +00:00
Duncan P. N. Exon Smith	1e1954f749	Add some target triples for better determinism These tests were failing on some buildbots after r206548 (reverted in r206556), but passing locally. They were missing target triples, so maybe that's the problem? git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206621 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 17:22:19 +00:00
Tim Northover	7b4b261611	AArch64/ARM64: add more NEON tests. Mostly no testing this time, since they were just wrangling target-specific intrinsics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206613 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 14:54:53 +00:00
Tim Northover	f34a512a68	ARM64: disable generation of .loh directives outside MachO. Part of PR19455. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206611 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 14:54:46 +00:00
Tim Northover	9cfd368302	ARM64: don't emit .subsections_via_symbols on ELF. Part of PR19455. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206610 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 14:54:41 +00:00
Tim Northover	1d5a2ad8a6	ARM64: add extra NEG pattern. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206609 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 14:54:35 +00:00
Tim Northover	936285440b	AArch64/ARM64: port more AArch64 tests to ARM64. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206592 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 13:16:55 +00:00
Tim Northover	753cfe6172	AArch64/ARM64: add non-scalar lowering for more FCVT operations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206591 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 13:16:42 +00:00
Tim Northover	7b4b522ec8	AArch64/ARM64: improve spotting of EXT instructions from VECTOR_SHUFFLE. We couldn't cope if the first mask element was UNDEF before, which isn't ideal. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206588 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 12:50:58 +00:00
Benjamin Kramer	c32e261a1a	X86: Pattern match scalar loads + vcvtph2ps into just vcvtph2ps. vcvtph2ps only reads the lower 64 bits of the address passed to the intrinsic. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206579 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 10:45:33 +00:00
Tim Northover	fb96efa7dd	AArch64/ARM64: port atomics test to ARM64. Covers quite a few extra instructions (like any of the max/min ones which were broken until recently on ARM64). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206575 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 09:31:31 +00:00
Tim Northover	0d6995985a	AArch64/ARM64: spot a greater variety of concat_vector operations. Code mostly copied from AArch64, just tidied up a trifle and plumbed into the ARM64 way of doing things. This also enables the AArch64 tests which inspired the previous untested commits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206574 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 09:31:27 +00:00
Tim Northover	70b63374f2	ARM64: implement cunning optimisation from AArch64 A vector extract followed by a dup can become a single instruction even if the types don't match. AArch64 handled this in ISelLowering, but a few reasonably simple patterns can take care of it in TableGen, so that's where I've put it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206573 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 09:31:20 +00:00
Tim Northover	66643da8fc	AArch64/ARM64: emit all vector FP comparisons as such. ARM64 was scalarizing some vector comparisons which don't quite map to AArch64's compare and mask instructions. AArch64's approach of sacrificing a little efficiency to emulate them with the limited set available was better, so I ported it across. More "inspired by" than copy/paste since the backend's internal expectations were a bit different, but the tests were invaluable. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206570 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 09:31:07 +00:00
Tim Northover	937290d7ed	AArch64/ARM64: port BSL logic from AArch64 & enable test. I enhanced it a little in the process. The decision shouldn't really be beased on whether a BUILD_VECTOR is a splat: any set of constants will do the job provided they're related in the correct way. Also, the BUILD_VECTOR could be any operand of the incoming AND nodes, so it's best to check for all 4 possibilities rather than assuming it'll be the RHS. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206569 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 09:31:01 +00:00
Tim Northover	2f5d14af9d	AArch64/ARM64: copy byval implementation from AArch64. It's not actually used to handle C or C++ ABI rules on ARM64, but could well be emitted by other language front-ends, so it's as well to have a sensible implementation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206568 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 09:30:52 +00:00
Jiangning Liu	bc3655f9c8	This is one of the optimizations ported from ARM64 to AArch64 to address the performance gap between these two back ends. The test case newly added for AArch64 already exists in ARM64. Patched by Z.Zheng git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206559 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 05:58:09 +00:00
Matt Arsenault	746734df1a	R600/SI: Try to use scalar BFE. Use scalar BFE with constant shift and offset when possible. This is complicated by the fact that the scalar version packs the two operands of the vector version into one. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206558 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 05:19:26 +00:00
Jiangning Liu	532a5ffe4c	This commit enables unaligned memory accesses of vector types on AArch64 back end. This should boost vectorized code performance. Patched by Z. Zheng git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206557 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 03:58:38 +00:00
Duncan P. N. Exon Smith	c7a3b95c0f	Revert "blockfreq: Rewrite BlockFrequencyInfoImpl" This reverts commits r206548, r206549 and r206549. There are some unit tests failing that aren't failing locally [1], so reverting until I have time to investigate. [1]: http://bb.pgr.jp/builders/ninja-x64-msvc-RA-centos6/builds/1816 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206556 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 02:17:43 +00:00
Duncan P. N. Exon Smith	cc1e1707b8	blockfreq: Rewrite BlockFrequencyInfoImpl Rewrite the shared implementation of BlockFrequencyInfo and MachineBlockFrequencyInfo entirely. The old implementation had a fundamental flaw: precision losses from nested loops (or very wide branches) compounded past loop exits (and convergence points). The @nested_loops testcase at the end of test/Analysis/BlockFrequencyAnalysis/basic.ll is motivating. This function has three nested loops, with branch weights in the loop headers of 1:4000 (exit:continue). The old analysis gives non-sensical results: Printing analysis 'Block Frequency Analysis' for function 'nested_loops': ---- Block Freqs ---- entry = 1.0 for.cond1.preheader = 1.00103 for.cond4.preheader = 5.5222 for.body6 = 18095.19995 for.inc8 = 4.52264 for.inc11 = 0.00109 for.end13 = 0.0 The new analysis gives correct results: Printing analysis 'Block Frequency Analysis' for function 'nested_loops': block-frequency-info: nested_loops - entry: float = 1.0, int = 8 - for.cond1.preheader: float = 4001.0, int = 32007 - for.cond4.preheader: float = 16008001.0, int = 128064007 - for.body6: float = 64048012001.0, int = 512384096007 - for.inc8: float = 16008001.0, int = 128064007 - for.inc11: float = 4001.0, int = 32007 - for.end13: float = 1.0, int = 8 Most importantly, the frequency leaving each loop matches the frequency entering it. The new algorithm leverages BlockMass and PositiveFloat to maintain precision, separates "probability mass distribution" from "loop scaling", and uses dithering to eliminate probability mass loss. I have unit tests for these types out of tree, but it was decided in the review to make the classes private to BlockFrequencyInfoImpl, and try to shrink them (or remove them entirely) in follow-up commits. The new algorithm should generally have a complexity advantage over the old. The previous algorithm was quadratic in the worst case. The new algorithm is still worst-case quadratic in the presence of irreducible control flow, but it's linear without it. The key difference between the old algorithm and the new is that control flow within a loop is evaluated separately from control flow outside, limiting propagation of precision problems and allowing loop scale to be calculated independently of mass distribution. Loops are visited bottom-up, their loop scales are calculated, and they are replaced by pseudo-nodes. Mass is then distributed through the function, which is now a DAG. Finally, loops are revisited top-down to multiply through the loop scales and the masses distributed to pseudo nodes. There are some remaining flaws. - Irreducible control flow isn't modelled correctly. LoopInfo and MachineLoopInfo ignore irreducible edges, so this algorithm will fail to scale accordingly. There's a note in the class documentation about how to get closer. See also the comments in test/Analysis/BlockFrequencyInfo/irreducible.ll. - Loop scale is limited to 4096 per loop (2^12) to avoid exhausting the 64-bit integer precision used downstream. - The "bias" calculation proposed on llvmdev is not incorporated here. This will be added in a follow-up commit, once comments from this review have been handled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206548 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 01:57:45 +00:00
Matt Arsenault	6834a55df3	R600/SI: Match sign_extend_inreg to s_sext_i32_i8 and s_sext_i32_i16 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206547 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 01:53:18 +00:00
Tom Stellard	cfe02c46dc	R600/SI: Use SReg_64 instead of VSrc_64 when selecting BUILD_PAIR git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206541 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 00:36:21 +00:00
Louis Gerbarg	fc8fa8238d	Make test/CodeGen/ARM64/vector-insertion.ll explicitly select neon syntax Change the command line vector-insertion.ll to explicitly set the neon syntax to apple so that buildbots that default to other syntaxes won't fail. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206502 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-17 21:32:41 +00:00
Tom Stellard	93ea1378d2	R600/SI: Stop using i128 as the resource descriptor type Having i128 as a legal type complicates the legalization phase. v4i32 is already a legal type, so we will use that instead. This fixes several piglit tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206500 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-17 21:00:11 +00:00
Louis Gerbarg	5540570374	Improve ARM64 vector creation This patch improves the performance of vector creation in caseiswhere where several of the lanes in the vector are a constant floating point value. It also includes new patterns to fold together some of the instructions when the value is 0.0f. Test cases included. rdar://16349427 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206496 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-17 20:51:50 +00:00
Jim Grosbach	4af58f145d	ARM64: [su]xtw use W regs as inputs, not X regs. Update the SXT[BHW]/UXTW instruction aliases and the shifted reg addressing mode handling. PR19455 and rdar://16650642 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206495 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-17 20:47:31 +00:00
Tim Northover	90dd89ed81	ARM64: switch to IR-based atomic operations. Goodbye code! (Game: spot the bug fixed by the change). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206490 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-17 20:00:33 +00:00
Tim Northover	fa9a0aa77b	ARM64: add acquire/release versions of the existing atomic intrinsics. These will be needed to support IR-level lowering of atomic operations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206489 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-17 20:00:24 +00:00
Josh Magee	a32348530f	[stack protector] Make the StackProtector pass respect ssp-buffer-size. Previously, SSPBufferSize was assigned the value of the "stack-protector-buffer-size" attribute after all uses of SSPBufferSize. The effect was that the default SSPBufferSize was always used during analysis. I moved the check for the attribute before the analysis; now --param ssp-buffer-size= works correctly again. Differential Revision: http://reviews.llvm.org/D3349 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206486 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-17 19:08:36 +00:00
Matt Arsenault	9e383d4b48	R600/SI: f64 frint is legal on CI git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206475 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-17 17:06:37 +00:00
Matt Arsenault	003de065a3	R600/SI: Fix zext from i1 to i64 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206437 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-17 02:03:08 +00:00
Adam Nemet	e1a38f7041	[ARM64] Fix "Cannot select" for vector ctpop The commit of r205855: Author: Arnold Schwaighofer <aschwaighofer@apple.com> Date: Wed Apr 9 14:20:47 2014 +0000 SLPVectorizer: Only vectorize intrinsics whose operands are widened equally The vectorizer only knows how to vectorize intrinics by widening all operands by the same factor. Patch by Tyler Nowicki! exposed a backend bug causing a regression (Cannot select ctpop). The commit msg is a bit confusing because the patch actually changes the behavior for the loop-vectorizer as well. As things got refactored into a helper ctpop got snuck in to the trivially-vectorizable helper which is now used by both vectorizers. In other words, we started seeing vector-ctpops in the backend. This change makes ctpop LegalizeAction::Expand for the types not supported by the byte-only CNT instruction. We may be able to custom-lower these later to a single CNT but this is to fix the compiler crash first. Fixes <rdar://problem/16578951> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206433 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-17 01:01:37 +00:00
Matheus Almeida	c308f165a0	[mips] Add initial support for NaN2008 in the back-end. This is so that EF_MIPS_NAN2008 is set if we are using IEEE 754-2008 NaN encoding (-mnan=2008). This patch also adds support for parsing '.nan legacy' and '.nan 2008' assembly directives. The handling of these directives should match GAS' behaviour i.e., the last directive in use sets the ELF header bit (EF_MIPS_NAN2008). Differential Revision: http://reviews.llvm.org/D3346 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206396 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-16 15:48:55 +00:00
Tim Northover	f539725734	AArch64/ARM64: port some NEON tests to ARM64 These ones used completely different sets of intrinsics, so the only way to do it is create a separate ARM64 copy and change them all. Other than that, CodeGen was straightforward, no deficiencies detected here. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206392 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-16 15:28:02 +00:00
Daniel Sanders	4134d06487	[mips] Fix emission of '.option pic0' for MIPS-IV. Summary: This was a case of incorrect usage of hasMips64() vs isABI_N64() Reviewers: matheusalmeida, dsanders Reviewed By: dsanders Differential Revision: http://reviews.llvm.org/D3398 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206388 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-16 13:58:57 +00:00
Daniel Sanders	849ca451c8	[mips] Correct r206370 to account for non-Linux targets using the small data section. This should fix the ninja-x64-msvc-RA-centos6 builder. I suspect the check in MipsSubtarget.cpp is incorrect and is really trying to check for a bare-metal target rather and anything other than linux. I'll investigate this. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206385 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-16 12:29:08 +00:00
Tim Northover	115d4f407b	ARM64: specify triple so that Linux tests pass Now that Linux is trying to reparse all inline asm it chokes on the different comment character in this test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206382 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-16 12:03:56 +00:00
Tim Northover	1a8adcb569	AArch64/ARM64: add another set of tests from AArch64 Another batch with no code changes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206381 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-16 11:53:07 +00:00
Tim Northover	1a44333f0e	AArch64/ARM64: port across stub handling for ELF C++ exceptions. The most important part here is that we should actuall emit the stubs we refer to in the exception table, but as a side issue this uses more sensible & GCC compatible representations for some of the bits of information. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206380 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-16 11:52:55 +00:00
Tim Northover	fef8e383eb	ARM64: use 32-bit moves for constants where possible. If we know that a particular 64-bit constant has all high bits zero, then we can rely on the fact that 32-bit ARM64 instructions automatically zero out the high bits of an x-register. This gives the expansion logic less constraints to satisfy and so sometimes allows it to pick better sequences. Came up while porting test/CodeGen/AArch64/movw-consts.ll: this will allow a 32-bit MOVN to be used in @test8 soon. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206379 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-16 11:52:51 +00:00
Tim Northover	ea9988a812	ARM64: use the integrated assembler on ELF. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206378 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-16 11:52:40 +00:00
Matheus Almeida	8e0f5768a6	[mips] Emit '.set nomicromips' before a function's entry label if not in micromips mode. The test (elf_st_other.ll) was renamed as the name and description didn't make sense as the test wasn't checking any symbol table entry. Differential Revision: http://reviews.llvm.org/D3346 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206377 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-16 11:46:59 +00:00
Daniel Sanders	b78eac2d4d	[mips] Correct callee saved list for the N32 ABI and enable test Summary: Depends on D3339 Reviewers: matheusalmeida, vmedic Reviewed By: matheusalmeida Differential Revision: http://reviews.llvm.org/D3340 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206371 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-16 10:23:37 +00:00
Daniel Sanders	c90bd4a882	[mips] Add calling convention tests covering O32, N32, and N64. Summary: I had difficulty finding tests for the N32 and N64 ABI so I've added a collection of calling convention tests based on the document MIPS ABIs Described (MD00305), the MIPSpro N32 Handbook, and the SYSV ABI. Where the documents/implementations disagree, I've used GCC to resolve the conflict. A few interesting details: * For N32, LLVM uses 64-bit pointers when saving $ra despite pointers being 32-bit. I've yet to find a supporting statement in the ABI documentation but the current behaviour matches GCC. * For O32, the non-variable portion of a varargs argument list is also subject to the rule that floating-point is passed via GPR's (on N32/N64 only the variable portion is subject to this rule). This agrees with GCC's behaviour and the SYSV ABI but contradicts part of the MIPSpro N32 Handbook which talks about O32's behaviour. * The N32 implementation has the wrong callee-saved register list. (I already have a fix for this but will commit it as a follow-up). I've left RUN-TODO lines in for O32 on MIPS64. I don't plan to support this case for now but we should revisit it. Reviewers: matheusalmeida, vmedic Reviewed By: matheusalmeida Differential Revision: http://reviews.llvm.org/D3339 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206370 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-16 09:59:46 +00:00
Tim Northover	946bea39dc	ARM64: explicitly ask for Apple NEON syntax so test passes on Linux git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206368 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-16 09:13:44 +00:00
Tim Northover	be50dc8b1f	ARM64: mark x7 as used when an i128 gets shunted onto the stack. The second half of a split i128 was ending up in x7, which is not a good thing. This is another part of PR19432. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206366 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-16 09:03:25 +00:00
Tim Northover	7474c171e1	DAGCombiner: don't optimise non-existant litpool load This particular DAG combine is designed to kick in when both ConstantFPs will end up being loaded via a litpool, however those nodes have a semi-legal status, dictated by isFPImmLegal so in some cases there wouldn't have been a litpool in the first place. Don't try to be clever in those circumstances. Picked up while merging some AArch64 tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206365 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-16 09:03:09 +00:00
Matt Arsenault	9fde950b1b	R600: Extend r600 sign_extend_inreg tests for EG Patch by: Jan Vesely <jan.vesely@rutgers.edu> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206349 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-16 01:41:34 +00:00
Matt Arsenault	bec5c611e1	R600/SI: Print more immediates in hex format Print in decimal for inline immediates, and hex otherwise. Use hex always for offsets in addressing offsets. This approximately matches what the shader compiler does. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206335 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-15 22:32:49 +00:00
Nick Lewycky	569e5a5b1c	Make this test not match its own filename, when being run from a path that includes the string 'add'. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206331 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-15 22:29:32 +00:00
Matt Arsenault	f8ea0352e0	R600/SI: Fix loads of i1 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206330 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-15 22:28:39 +00:00
Akira Hatanaka	f3930395f5	Make FastISel::SelectInstruction return before target specific fast-isel code handles Intrinsic::trap if TargetOptions::TrapFuncName is set. This fixes a bug in which the trap function was not taken into consideration when a program was compiled without optimization (at -O0). <rdar://problem/16291933> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206323 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-15 21:30:06 +00:00
Andrea Di Biagio	749e8fee34	[X86] Improve the lowering of packed shifts by constant build_vector. This patch teaches the backend how to efficiently lower logical and arithmetic packed shifts on both SSE and AVX/AVX2 machines. When possible, instead of scalarizing a vector shift, the backend should try to expand the shift into a sequence of two packed shifts by immedate count followed by a MOVSS/MOVSD. Example (v4i32 (srl A, (build_vector < X, Y, Y, Y>))) Can be rewritten as: (v4i32 (MOVSS (srl A, <Y,Y,Y,Y>), (srl A, <X,X,X,X>))) [with X and Y ConstantInt] The advantage is that the two new shifts from the example would be lowered into X86ISD::VSRLI nodes. This is always cheaper than scalarizing the vector into four scalar shifts plus four pairs of vector insert/extract. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206316 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-15 19:30:48 +00:00
Quentin Colombet	49ad5d5dd5	[ARM64] Set default CPU to generic instead of cyclone. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206313 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-15 19:08:46 +00:00
Robert Lougher	4634a9ba1e	Revert r191049/r191059 as it can produce wrong code (see PR17975). It has already been reverted on the 3.4 branch in r196521. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206311 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-15 18:34:24 +00:00
Tim Northover	1e3f66afe8	AArch64/ARM64: enable more AArch64 tests on ARM64. No code changes for this bunch, just some test rejigs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206291 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-15 14:00:29 +00:00
Tim Northover	5f8234943d	AArch64/ARM64: add missing pattern for extending load. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206290 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-15 14:00:19 +00:00
Tim Northover	2a83cb71ad	AArch64/ARM64: only mangle MOVZ/MOVN during encoding when needed Sometimes we need emit the bits that would actually be a MOVN when producing a relocated MOVZ instruction (don't ask). But not always, a check which ARM64 got wrong until now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206289 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-15 14:00:15 +00:00
Tim Northover	5080ae2e21	AArch64/ARM64: add support for large code-model jump tables. I've left the MachO CodeGen as it is, there's a reasonable chance it should use the GOT like ConstPools, but I'm not certain. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206288 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-15 14:00:11 +00:00
Tim Northover	e9cae9b79f	AArch64/ARM64: add patterns for various commutations of FNMADD. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206287 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-15 14:00:06 +00:00
Tim Northover	e8bc8a7d58	AArch64/ARM64: add half as a storage type on ARM64. This brings it into line with the AArch64 behaviour and should open the way for certain OpenCL features. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206286 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-15 14:00:03 +00:00
Tim Northover	380fa65d5d	AArch64/ARM64: copy patterns for fixed-point conversions Code is mostly copied directly across, with a slight extension of the ISelDAGToDAG function so that it can cope with the floating-point constants being behind a litpool. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206285 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-15 13:59:57 +00:00
Tim Northover	1291807e03	ARM64: add constraints to various FastISel operations git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206284 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-15 13:59:53 +00:00
Tim Northover	f90c8c7063	AArch64/ARM64: add more arm64 lines to AArch64 regression tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206282 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-15 13:59:44 +00:00
Tim Northover	3f2d713f4d	AArch64/ARM64: add dp tests from AArch64 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206281 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-15 13:59:40 +00:00
Quentin Colombet	e8603c43f2	[Register Coalescer] Add a test case for 206060. <rdar://problem/16582185> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206235 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-15 01:15:32 +00:00
Louis Gerbarg	261f0df185	Fix for codegen bug that could cause illegal cmn instruction generation In rare cases the dead definition elimination pass code can cause illegal cmn instructions when it replaces dead registers on instructions that use unmaterialized frame indexes. This patch disables the dead definition optimization for instructions which include frame index operands. rdar://16438284 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206208 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-14 21:05:05 +00:00
Louis Gerbarg	27539d46cc	Add a flag to disable the ARM64DeadRegisterDefinitionsPass This patch adds a -arm64-dead-def-elimination flag so that it is possible to disable dead definition elimination. Includes test case. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206207 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-14 21:05:02 +00:00
Akira Hatanaka	268c0509a9	Fix a bug in which BranchProbabilityInfo wasn't setting branch weights of basic blocks inside loops correctly. Previously, BranchProbabilityInfo::calcLoopBranchHeuristics would determine the weights of basic blocks inside loops even when it didn't have enough information to estimate the branch probabilities correctly. This patch fixes the function to exit early if it doesn't see any exit edges or back edges and let the later heuristics determine the weights. This fixes PR18705 and <rdar://problem/15991090>. Differential Revision: http://reviews.llvm.org/D3363 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206194 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-14 16:56:19 +00:00
Richard Trieu	b79d042e4e	Fix 2008-03-05-SxtInRegBug.ll so that the CHECK-NOT will not match the filename. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206193 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-14 16:53:50 +00:00
Daniel Sanders	6ede5d24b8	[mips] Fix fcopysign for MIPS-IV and add the test. Summary: This was another incorrect use of hasMips64() vs isGP64bit(). Depends on D3344 Reviewers: matheusalmeida, vmedic Reviewed By: vmedic Differential Revision: http://reviews.llvm.org/D3347 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206187 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-14 16:24:12 +00:00
Daniel Sanders	0543dab791	[mips] MIPS-IV is broadly the same as MIPS64 so duplicate all -mcpu=mips64 tests with -mcpu=mips4 as a starting point Summary: Two exceptions to this: test/CodeGen/Mips/octeon.ll test/CodeGen/Mips/octeon_popcnt.ll these test extensions to MIPS64 One test is altered for MIPS-IV: test/CodeGen/Mips/mips64countleading.ll Tests dclo/dclz which were added in MIPS64. The MIPS-IV version tests that dclo/dclz are not emitted. Four tests fail and are not in this patch: test/CodeGen/Mips/abicalls.ll test/CodeGen/Mips/fcopysign-f32-f64.ll test/CodeGen/Mips/fcopysign.ll test/CodeGen/Mips/stack-alignment.ll Depends on D3343 Reviewers: matheusalmeida, vmedic Reviewed By: vmedic Differential Revision: http://reviews.llvm.org/D3344 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206185 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-14 16:00:28 +00:00
Daniel Sanders	327be6483d	[mips] Fix more incorrect uses of HasMips64 and isMips64() Summary: - Conditional moves acting on 64-bit GPR's should require MIPS-IV rather than MIPS64 - ISD::MUL, and ISD::MULH[US] should be lowered on all 64-bit ISA's Patch by David Chisnall His work was sponsored by: DARPA, AFRL I've added additional testcases to cover as much of the codegen changes affecting MIPS-IV as I can. Where I've been unable to find an existing MIPS64 testcase that can be re-used for MIPS-IV (mainly tests covering ISD::GlobalAddress and similar), I at least agree that MIPS-IV should behave like MIPS64. Further testcases that are fixed by this patch will follow in my next commit. The testcases from that commit that fail for MIPS-IV without this patch are: LLVM :: CodeGen/Mips/2010-07-20-Switch.ll LLVM :: CodeGen/Mips/cmov.ll LLVM :: CodeGen/Mips/eh-dwarf-cfa.ll LLVM :: CodeGen/Mips/largeimmprinting.ll LLVM :: CodeGen/Mips/longbranch.ll LLVM :: CodeGen/Mips/mips64-f128.ll LLVM :: CodeGen/Mips/mips64directive.ll LLVM :: CodeGen/Mips/mips64ext.ll LLVM :: CodeGen/Mips/mips64fpldst.ll LLVM :: CodeGen/Mips/mips64intldst.ll LLVM :: CodeGen/Mips/mips64load-store-left-right.ll LLVM :: CodeGen/Mips/sint-fp-store_pattern.ll Reviewers: dsanders Reviewed By: dsanders CC: matheusalmeida Differential Revision: http://reviews.llvm.org/D3343 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206183 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-14 15:44:42 +00:00
Tim Northover	f86c5472c0	ARM64: specify full triple in tests to pacify Windows. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206175 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-14 13:18:48 +00:00
Tim Northover	069bcd7b38	AArch64: add newline to end of test files. Should be no other change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206174 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-14 13:18:40 +00:00
Tim Northover	856ecbc068	ARM64: remove buggy REV16 pattern. The 32-bit pattern is still valid: 0123 -> 3210 -> 1032. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206172 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-14 12:59:52 +00:00
Tim Northover	e90d4e2c69	AArch64/ARM64: enable directcond.ll test on ARM64. Code change is because optimizeCompareInstr didn't know how to pull the condition code out of FCSEL instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206171 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-14 12:51:06 +00:00
Tim Northover	97c1fc0832	ARM64: add patterns for csXYZ with reversed operands. AArch64 tests for this, and it's obviously a good idea. Have to invert the condition code, of course. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206170 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-14 12:51:02 +00:00
Tim Northover	dce294e7b1	ARM64: enable more regression tests from AArch64 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206169 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-14 12:50:58 +00:00
Tim Northover	3c68c5c55e	ARM64: add support for AArch64's addsub_ext.ll There was one definite issue in ARM64 (the off-by-1 check for whether a shift could be folded in) and one difference that is probably correct: ARM64 didn't fold nodes with multiple uses into the arithmetic operations unless optimising for code size. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206168 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-14 12:50:50 +00:00
Tim Northover	41b47904ba	ARM64: optimise (cmp x, (sub 0, y)) to (cmn x, y). This transformation is only valid when being used for an EQ or NE comparison since the flags change otherwise. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206167 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-14 12:50:47 +00:00
Tim Northover	e23fabc2ac	ARM64: start porting regression test suite from AArch64 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206166 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-14 12:50:41 +00:00
Richard Osborne	c9fa660ddd	[XCore] Don't create invalid MKMSK instructions inside loadImmediate(). Summary: Previously loadImmediate() would produce MKMSK instructions with invalid immediate values such as mkmsk r0, 9. Fix this by checking the mask size is valid. Reviewers: robertlytton Reviewed By: robertlytton CC: llvm-commits Differential Revision: http://reviews.llvm.org/D3289 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206163 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-14 12:30:35 +00:00
Hal Finkel	6a34916fbf	[PowerPC] Fix rlwimi isel when mask is not constant We had been using the known-zero values of the operand of the or to construct the mask for an rlwimi; this is not quite correct, but fine when the mask is constant. When the mask is constant, then the known zeros of the operand must be a superset of the zeros in the mask. However, when the mask is not a constant, then there might be bits in the operand that are not known to be zero that, at runtime, might be zero in the mask. Therefore, we check that any bits not known to be zero are known to be one in the mask. Otherwise, we can't fold the mask with the or and shift. This was revealed as a miscompile of MultiSource/Benchmarks/BitBench/drop3/drop3 when I started experimenting with constant hoisting. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206136 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-13 17:10:58 +00:00
Hal Finkel	f4c3a5601a	[PowerPC] Implement some additional TLI callbacks Add implementations of: bool isLegalICmpImmediate(int64_t Imm) const bool isLegalAddImmediate(int64_t Imm) const bool isTruncateFree(Type Ty1, Type Ty2) const bool isTruncateFree(EVT VT1, EVT VT2) const bool shouldConvertConstantLoadToIntImm(const APInt &Imm, Type *Ty) const Unfortunately, this regresses counter-register-based loop formation because some of the loops now end up in forms were SE cannot compute loop counts. However, nevertheless, the test-suite results favor committing: SingleSource/Benchmarks/BenchmarkGame/puzzle: 26% speedup MultiSource/Benchmarks/FreeBench/analyzer/analyzer: 21% speedup MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan: 20% speedup SingleSource/Benchmarks/Polybench/linear-algebra/kernels/trisolv/trisolv: 19% speedup SingleSource/Benchmarks/Polybench/linear-algebra/kernels/gesummv/gesummv: 15% speedup MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2: 2% speedup MultiSource/Benchmarks/VersaBench/bmm/bmm: 26% slowdown git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206120 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-12 21:52:38 +00:00
Richard Trieu	6a871a361d	Add extra checks to mvn.ll test to prevent the "f1" check from matching on a directory name instead of the function name. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206104 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-12 04:47:04 +00:00
Hal Finkel	e13b87c08d	Reenable use of TBAA during CodeGen We had disabled use of TBAA during CodeGen (even when otherwise using AA) because the ptrtoint/inttoptr used by CGP for address sinking caused BasicAA to miss basic type punning that it should catch (and, thus, we'd fail to override TBAA when we should). However, when AA is in use during CodeGen, CGP now uses normal GEPs and bitcasts, instead of ptrtoint/inttoptr, when doing address sinking. As a result, BasicAA should be able to make us do the right thing in the face of type-punning, and it seems safe to enable use of TBAA again. self-hosting seems fine on PPC64/Linux on the P7, with TBAA enabled and -misched=shuffle. Note: We still don't update TBAA when merging stack slots, although because BasicAA should now catch all such cases, this is no longer a blocking issue. Nevertheless, I plan to commit code to deal with this properly in the near future. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206093 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-12 01:26:00 +00:00
Hal Finkel	24517d023f	Add the ability to use GEPs for address sinking in CGP The current memory-instruction optimization logic in CGP, which sinks parts of the address computation that can be adsorbed by the addressing mode, does this by explicitly converting the relevant part of the address computation into IR-level integer operations (making use of ptrtoint and inttoptr). For most targets this is currently not a problem, but for targets wishing to make use of IR-level aliasing analysis during CodeGen, the use of ptrtoint/inttoptr is a problem for two reasons: 1. BasicAA becomes less powerful in the face of the ptrtoint/inttoptr 2. In cases where type-punning was used, and BasicAA was used to override TBAA, BasicAA may no longer do so. (this had forced us to disable all use of TBAA in CodeGen; something which we can now enable again) This (use of GEPs instead of ptrtoint/inttoptr) is not currently enabled by default (except for those targets that use AA during CodeGen), and so aside from some PowerPC subtargets and SystemZ, there should be no change in behavior. We may be able to switch completely away from the ptrtoint/inttoptr sinking on all targets, but further testing is required. I've doubled-up on a number of existing tests that are sensitive to the address sinking behavior (including some store-merging tests that are sensitive to the order of the resulting ADD operations at the SDAG level). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206092 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-12 00:59:48 +00:00
Louis Gerbarg	5672630b7c	Add ARM64 CLS patterns This patch adds patterns to generate the cls instruction ARM64. Includes tests for 64 bit and 32 bit operands. rdar://15611957 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206079 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-11 22:27:58 +00:00
Quentin Colombet	010311b104	[RegAllocGreedy][Last Chance Recoloring] Change the name of the exhaustive search option. fexhaustive-register-search => exhaustive-register-search 'f' is a Clang thing! This is related to PR18747. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206075 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-11 21:51:09 +00:00
Quentin Colombet	92a892e8f9	[RegAllocGreedy][Last Chance Recoloring] Addition of -fexhaustive-register-search option to allow an exhaustive search during last chance recoloring. This is related to PR18747 Patch by MAYUR PANDEY <mayur.p@samsung.com>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206072 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-11 21:39:44 +00:00
Tom Stellard	e04360918b	SelectionDAG: Use helper function to improve legalization of ISD::MUL The TargetLowering::expandMUL() helper contains lowering code extracted from the DAGTypeLegalizer and allows the SelectionDAGLegalizer to expand more ISD::MUL patterns without having to use a library call. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206037 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-11 16:12:01 +00:00
Reid Kleckner	bc1fd917f0	Move the segmented stack switch to a function attribute This removes the -segmented-stacks command line flag in favor of a per-function "split-stack" attribute. Patch by Luqman Aden and Alex Crichton! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205997 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-10 22:58:43 +00:00
Josh Magee	ee66766568	[stack protector] Refactor and clean-up test. No functionality change. Refactored stack-protector.ll to use new-style function attributes everywhere and eliminated unnecessary attributes. This cleanup is in preparation for an upcoming test change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205996 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-10 22:47:27 +00:00
Jim Grosbach	6472a514dc	X86: Tighten up test. llc CPU autodection bites again. Speculative fix for bot failures. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205940 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-10 00:27:43 +00:00
Jim Grosbach	afb4ef3549	Add support for load folding of avx1 logical instructions AVX supports logical operations using an operand from memory. Unfortunately because integer operations were not added until AVX2 the AVX1 logical operation's types were preventing the isel from folding the loads. In a limited number of cases the peephole optimizer would fold the loads, but most were missed. This patch adds explicit patterns with appropriate casts in order for these loads to be folded. The included test cases run on reduced examples and disable the peephole optimizer to ensure the folds are being pattern matched. Patch by Louis Gerbarg <lgg@apple.com> rdar://16355124 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205938 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-09 23:39:25 +00:00
Jim Grosbach	e9915738be	SelectionDAG: Don't constant fold target-specific nodes. FoldConstantArithmetic() only knows how to deal with a few target independent ISD opcodes. Bail early if it sees a target-specific ISD node. These node do funny things with operand types which may break the assumptions of the code that follows, and there's no actual folding that can be done anyway. For example, non-constant 256 bit vector shifts on X86 have a shift-amount operand that's a 128-bit v4i32 vector regardless of what the first operand type is and that breaks the assumption that the operand types must match. rdar://16530923 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205937 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-09 23:28:11 +00:00
Chad Rosier	c3de5ed072	[AArch64] Implement the isZExtFree APIs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205926 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-09 20:51:21 +00:00
Chad Rosier	fe5c9cee80	[AArch64] Implement the isTruncateFree API. In AArch64 i64 to i32 truncate operation is a subregister access. This allows more opportunities for LSR optmization to eliminate variables of different types (i32 and i64). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205925 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-09 20:43:40 +00:00
Quentin Colombet	e4b14ebcbd	[DAGCombiner] DAG combine does not know how to combine indexed loads with sign/zero/any extensions. However a few places were not checking properly the property of the load and were turning an indexed load into a regular extended load. Therefore the indexed value was lost during the process and this was triggering an assertion. <rdar://problem/16389332> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205923 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-09 20:03:05 +00:00
Justin Holewinski	77f268945e	[NVPTX] Add preliminary intrinsics and codegen support for textures/surfaces This commit adds intrinsics and codegen support for the surface read/write and texture read instructions that take an explicit sampler parameter. Codegen operates on image handles at the PTX level, but falls back to direct replacement of handles with kernel arguments if image handles are not enabled. Note that image handles are explicitly disabled for all target architectures in this change (to be enabled later). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205907 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-09 15:39:15 +00:00
Justin Holewinski	ac4c131de6	[NVPTX] Add support for addrspacecast in global variable initializers, including emitting generic() when casting to address space 0. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205906 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-09 15:39:11 +00:00
Alp Toker	46d36be2eb	Fix some doc and comment typos git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205899 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-09 14:47:27 +00:00
Bradley Smith	5a09ce9ad1	[ARM64] Rename LR to the UAL-compliant 'X30'. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205885 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-09 14:43:59 +00:00
Bradley Smith	37fe6627f6	[ARM64] Rename FP to the UAL-compliant 'X29'. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205884 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-09 14:43:50 +00:00
Elena Demikhovsky	0d5d656524	AVX-512: insert element to mask vector; store i1 data Implemented INSERT_VECTOR_ELT operation for v16i1 and v8i1 vectors; Implemented "store" for i1 type git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205850 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-09 12:37:50 +00:00
Daniel Sanders	e777fb4725	Re-commit: [mips] abs.[ds], and neg.[ds] should be allowed regardless of -enable-no-nans-fp-math Summary: They behave in accordance with the Has2008 and ABS2008 configuration bits of the processor which are used to select between the 1985 and 2008 versions of IEEE 754. In 1985 mode, these instructions are arithmetic (i.e. they raise invalid operation exceptions when given NaN), in 2008 mode they are non-arithmetic (i.e. they are copies). nmadd.[ds], and nmsub.[ds] are still subject to -enable-no-nans-fp-math because the ISA spec does not explicitly state that they obey Has2008 and ABS2008. Fixed the issue with the previous version of this patch (r205628). A pre-existing 'let Predicate =' statement was removing some predicates that were necessary for FP64 to behave correctly. Reviewers: matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3274 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205844 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-09 09:56:43 +00:00
Matt Arsenault	d4786ed1de	R600/SI: Match not instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205837 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-09 07:16:16 +00:00
Tim Northover	87a79507fa	ARM64: scalarize v1i64 mul operation This is the second part of fixing PR19367. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205836 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-09 07:07:02 +00:00
Tim Northover	7db3c63bb2	ARM64: add pattern for <1 x i64> custom not node. This should fix PR19367. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205835 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-09 06:55:39 +00:00
Juergen Ributzka	c6a7502a80	[Constant Hoisting][ARM64] Enable constant hoisting for ARM64. This implements the target-hooks for ARM64 to enable constant hoisting. This fixes <rdar://problem/14774662> and <rdar://problem/16381500>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205791 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-08 20:39:59 +00:00
Tim Northover	362090adf5	ARM64: fix fmsub patterns which assumed accum operand was first Confusingly, the NEON fmla instructions put the accumulator first but the scalar versions put it at the end (like the fma lib function & LLVM's intrinsic). This should fix PR19345, assuming there's only one issue. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205758 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-08 12:23:51 +00:00
Elena Demikhovsky	dbcb670605	AVX-512: Added fp_to_uint and uint_to_fp patterns. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205754 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-08 07:24:02 +00:00
Reed Kotler	bb0572a5d1	Reverting commit r205628 due to mips64 issues. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205741 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-07 22:11:40 +00:00
Tom Stellard	1d8c7eb225	R600/SI: Handle INSERT_SUBREG in SIFixSGPRCopies git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205732 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-07 19:45:45 +00:00
Tom Stellard	5c9bb7119a	R600: Match 24-bit arithmetic patterns in a Target DAGCombine Moving these patterns from TableGen files to PerformDAGCombine() should allow us to generate better code by eliminating unnecessary shifts and extensions earlier. This also fixes a bug where the MAD pattern was calling SimplifyDemandedBits with a 24-bit mask on the first operand even when the full pattern wasn't being matched. This occasionally resulted in some instructions being incorrectly deleted from the program. v2: - Fix bug with 64-bit mul git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205731 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-07 19:45:41 +00:00
NAKAMURA Takumi	2e858fcaad	Quick fix: Triple::isOSMSVCRT() should be false for targeting cygwin. It affected callee's stack pop in x86. It is one of devergences between cygwin and mingw since mingw-gcc-4.6. Added testcases to llvm/test/CodeGen/X86/win32_sret.ll for cygwin. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205688 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-06 10:01:23 +00:00
Hal Finkel	b12c642bbf	[PowerPC] Add a full condition code register to make the "cc" clobber work gcc inline asm supports specifying "cc" as a clobber of all condition registers. Add just enough modeling of the full register to make this work. Fixed PR19326. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205630 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-04 15:15:57 +00:00
Daniel Sanders	dc404fff12	[mips] abs.[ds], and neg.[ds] should be allowed regardless of -enable-no-nans-fp-math Summary: They behave in accordance with the Has2008 and ABS2008 configuration bits of the processor which are used to select between the 1985 and 2008 versions of IEEE 754. In 1985 mode, these instructions are arithmetic (i.e. they raise invalid operation exceptions when given NaN), in 2008 mode they are non-arithmetic (i.e. they are copies). nmadd.[ds], and nmsub.[ds] are still subject to -enable-no-nans-fp-math because the ISA spec does not explicitly state that they obey Has2008 and ABS2008. Reviewers: matheusalmeida Reviewed By: matheusalmeida Differential Revision: http://llvm-reviews.chandlerc.com/D3274 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205628 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-04 14:52:54 +00:00
Tim Northover	659e09e372	DAGLegalize: add last-ditch type-legalization for VSELECT. When LLVM sees something like (v1iN (vselect v1i1, v1iN, v1iN)) it can decide that the result is OK (v1i64 is legal on AArch64, for example) but it still need scalarising because of that v1i1. There was no code to do this though. AArch64 and ARM64 have DAG combines to produce efficient code and prevent that occuring in most such situations, but there are edge cases that they miss. This adds a legalization to cope with that. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205626 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-04 14:49:30 +00:00
Tim Northover	4a4d62bfb9	ARM64: handle v1i1 types arising from setcc properly. There were several overlapping problems here, and this solution is closely inspired by the one adopted in AArch64 in r201381. Firstly, scalarisation of v1i1 setcc operations simply fails if the input types are legal. This is fixed in LegalizeVectorTypes.cpp this time, and allows AArch64 code to be simplified slightly. Second, vselect with such a setcc feeding into it ends up in ScalarizeVectorOperand, where it's not handled. I experimented with an implementation, but found that whatever DAG came out was rather horrific. I think Hao's DAG combine approach is a good one for quality, though there are edge cases it won't catch (to be fixed separately). Should fix PR19335. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205625 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-04 14:49:21 +00:00
Tim Northover	0eb313be18	ARM64: use regalloc-friendly COPY_TO_REGCLASS for bitcasts The previous patterns directly inserted FMOV or INS instructions into the DAG for scalar_to_vector & bitconvert patterns. This is horribly inefficient and can generated lots more GPR <-> FPR register traffic than necessary. It's much better to emit instructions the register allocator understands so it can coalesce the copies when appropriate. It led to at least one ISelLowering hack to avoid the problems, which was incorrect for v1i64 (FPR64 has no dsub). It can now be removed entirely. This should also fix PR19331. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205616 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-04 09:03:09 +00:00
Tim Northover	604dff27c9	ARM64: add 128-bit MLA operations to the custom selection code. Without this change, the llvm_unreachable kicked in. The code pattern being spotted is rather non-canonical for 128-bit MLAs, but it can happen and there's no point in generating sub-optimal code for it just because it looks odd. Should fix PR19332. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205615 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-04 09:03:02 +00:00
Quentin Colombet	3b2b5dfa9b	[RegAllocGreedy][Last Chance Recoloring] Emit diagnostics when last chance recoloring cut-offs are encountered and register allocation failed. This is related to PR18747 Patch by MAYUR PANDEY <mayur.p@samsung.com>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205601 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-04 02:05:21 +00:00
Quentin Colombet	cc99615837	Revert r205599, the commit was not intended to have so many changes git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205600 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-04 02:02:49 +00:00
Quentin Colombet	c65a77b92d	[RegAllocGreedy][Last Chance Recoloring] Emit diagnostics when last chance recoloring cut-offs are hit. This is related to PR18747. Patch by MAYUR PANDEY <mayur.p@samsung.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205599 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-04 01:58:57 +00:00
Saleem Abdulrasool	09ebd6d735	ARM: fix test case missed in previous roundup This should hopefully bring the last MSVC buildbot back to green! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205596 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-04 01:19:56 +00:00
Saleem Abdulrasool	2abadea537	ARM: yet another round of ARM test clean ups git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205586 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-03 23:47:24 +00:00
Eli Bendersky	25540a7f39	Optimize away unnecessary address casts. Removes unnecessary casts from non-generic address spaces to the generic address space for certain code patterns. Patch by Jingyue Wu. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205571 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-03 21:18:25 +00:00
Lang Hames	89218827c8	[ARM64] Teach the ARM64DeadRegisterDefinition pass to respect implicit-defs. When rematerializing through truncates, the coalescer may produce instructions with dead defs, but live implicit-defs of subregs: E.g. %X1<def,dead> = MOVi64imm 2, %W1<imp-def>; %X1:GPR64, %W1:GPR32 These instructions are live, and their definitions should not be rewritten. Fixes <rdar://problem/16492408> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205565 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-03 20:51:08 +00:00
Tom Stellard	a7469745de	R600: Correct opcode for BFE_INT Acording to AMD documentation, the correct opcode for BFE_INT is 0x5, not 0x4 Fixes Arithm/Absdiff.Mat/3 OpenCV test Patch by: Bruno Jiménez git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205562 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-03 20:19:29 +00:00
Tom Stellard	50c16fb65c	R600/SI: Lower 64-bit immediates using REG_SEQUENCE git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205561 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-03 20:19:27 +00:00
NAKAMURA Takumi	df34283c2a	llvm/test/CodeGen/X86/peephole-multiple-folds.ll: Relax expressions to satisfy win32. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205559 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-03 20:07:51 +00:00
Saleem Abdulrasool	5fe5b3dcc8	ARM: update even more tests More updating of tests to be explicit about the target triple rather than relying on the default target triple supporting ARM mode. Indicate to lit that object emission is not yet available for Windows on ARM. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205545 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-03 17:35:22 +00:00
Saleem Abdulrasool	27b1252c13	ARM: fixup more tests to specify the target more explicitly This changes the tests that were targeting ARM EABI to explicitly specify the environment rather than relying on the default. This breaks with the new Windows on ARM support when running the tests on Windows where the default environment is no longer EABI. Take the opportunity to avoid a pointless redirect (helps when trying to debug with providing a command line invocation which can be copy and pasted) and removing a few greps in favour of FileCheck. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205541 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-03 16:01:44 +00:00
Tim Northover	d5561bb1f0	ARM: tell LLVM about zext properties of ldrexb/ldrexh Implementing this via ComputeMaskedBits has two advantages: + It actually works. DAGISel doesn't deal with the chains properly in the previous pattern-based solution, so they never trigger. + The information can be used in other DAG combines, as well as the trivial "get rid of truncs". For example if the trunc is in a different basic block. rdar://problem/16227836 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205540 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-03 15:10:35 +00:00
Tim Northover	3eb87654a5	ARM: skip cmpxchg failure barrier if ordering is monotonic. The terminal barrier of a cmpxchg expansion will be either Acquire or SequentiallyConsistent. In either case it can be skipped if the operation has Monotonic requirements on failure. rdar://problem/15996804 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205535 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-03 13:06:54 +00:00
Tim Northover	badb137729	ARM: expand atomic ldrex/strex loops in IR The previous situation where ATOMIC_LOAD_WHATEVER nodes were expanded at MachineInstr emission time had grown to be extremely large and involved, to account for the subtly different code needed for the various flavours (8/16/32/64 bit, cmpxchg/add/minmax). Moving this transformation into the IR clears up the code substantially, and makes future optimisations much easier: 1. an atomicrmw followed by using the new value can be more efficient. As an IR pass, simple CSE could handle this efficiently. 2. Making use of cmpxchg success/failure orderings only has to be done in one (simpler) place. 3. The common "cmpxchg; did we store?" idiom can be exposed to optimisation. I intend to gradually improve this situation within the ARM backend and make sure there are no hidden issues before moving the code out into CodeGen to be shared with (at least ARM64/AArch64, though I think PPC & Mips could benefit too). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205525 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-03 11:44:58 +00:00
Silviu Baranga	3f11cd0d25	[ARM] When generating a vpaddl node the input lane type is not always the type of the add operation since extract_vector_elt can perform an extend operation. Get the input lane type from the vector on which we're performing the vpaddl operation on and extend or truncate it to the output type of the original add node. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205523 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-03 10:44:27 +00:00
Tim Northover	107283d7cb	ARM64: add regression test for r205519. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205520 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-03 09:36:05 +00:00
Tim Northover	b642eb5dbc	ARM64: don't generate __sincos_stret calls unless on MachO This should fix PR19314. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205514 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-03 07:06:13 +00:00
Richard Trieu	1498ceee9e	Fix test case. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205492 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-03 00:14:18 +00:00
Lang Hames	ed2154b816	[CodeGen] Teach the peephole optimizer to remember (and exploit) all folding opportunities in the current basic block, rather than just the last one seen. <rdar://problem/16478629> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205481 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-02 22:59:58 +00:00
Juergen Ributzka	75cea2c73e	Add comments and test case for [DAG] Keep the opaque constant flag when performing unary constant folding operations (r204737). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205474 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-02 22:21:01 +00:00
Saleem Abdulrasool	6d4e5ab349	ARM: fixup tests to specify the target more explicitly This changes the tests that were targeting ARM EABI to explicitly specify the environment rather than relying on the default. This breaks with the new Windows on ARM support when running the tests on Windows where the default environment is no longer EABI. Take the opportunity to avoid a pointless redirect (helps when trying to debug with providing a command line invocation which can be copy and pasted) and removing a few greps in favour of FileCheck. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205465 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-02 21:22:03 +00:00
Saleem Abdulrasool	396e5e328c	ARM: update subtarget information for Windows on ARM Update the subtarget information for Windows on ARM. This enables using the MC layer to target Windows on ARM. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205459 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-02 20:32:05 +00:00
Tom Stellard	adb852ddf3	TargetLibraryInfo: Disable memcpy and memset on R600 There are no implementations of these for R600. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205455 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-02 19:53:29 +00:00
Jim Grosbach	acb6d9834a	ARM: cortex-m0 doesn't support unaligned memory access. Unlike other v6+ processors, cortex-m0 never supports unaligned accesses. From the v6m ARM ARM: "A3.2 Alignment support: ARMv6-M always generates a fault when an unaligned access occurs." rdar://16491560 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205452 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-02 19:28:13 +00:00
Kai Nacke	b96fc4a5ea	[mips] Add more Octeon cnMips instructions Adds the instructions ext/ext32/cins/cins32. It also changes pop/dpop to accept the two operand version and adds a simple pattern to generate baddu. Tests for the two operand versions (including baddu/dmul/dpop/pop) and the code generation pattern for baddu are included. Reviewed by: Daniel.Sanders@imgtec.com git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205449 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-02 18:40:43 +00:00
Oliver Stannard	af48fc4136	ARM: Add support for segmented stacks Patch by Alex Crichton, ILyoan, Luqman Aden and Svetoslav. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205430 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-02 16:10:33 +00:00
Tim Northover	6584d94610	ARM64: use GOT for weak symbols & PIC. Weak symbols cannot use the small code model's usual ADRP sequences since the instruction simply may not be able to encode a value of 0. This redirects them to use the GOT, which hopefully linkers are able to cope with even in the static relocation model. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205426 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-02 14:39:11 +00:00
Tim Northover	671c92d886	ARM64: fix lowering of fp128 fptosi/fptoui We were creating libcall nodes that returned an MVT::f128, when these particular operations actually return an int of some stripe. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205425 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-02 14:39:07 +00:00
Tim Northover	3844cadc9a	ARM64: make sure first argument to INSERT_SUBVECTOR has right type. Again, coalescing and other optimisations swiftly made the MachineInstrs consistent again, but when compiled at -O0 a bad INSERT_SUBREGISTER was produced. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205423 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-02 14:38:58 +00:00
Tim Northover	87e824120d	ARM64: convert fp16 narrowing ISel to pseudo-instruction The previous attempt was fine with optimisations, but was actually rather cavalier with its types. When compiled at -O0, it produced invalid COPY MachineInstrs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205422 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-02 14:38:54 +00:00
Job Noorman	4e7ec2b053	Mark FPB as a reserved register when needed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205421 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-02 13:13:56 +00:00
Renato Golin	421397ac00	Remove duplicated DMB instructions ARM specific optimiztion, finding places in ARM machine code where 2 dmbs follow one another, and eliminating one of them. Patch by Reinoud Elhorst. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205409 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-02 09:03:43 +00:00
Hal Finkel	4a6c0afc52	[PowerPC] Add some missing VSX bitcast patterns git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205352 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-01 19:24:27 +00:00
Reid Kleckner	f319a2c3b3	Support segmented stacks on Win64 Identical to Win32 method except the GS segment register is used for TLS instead of FS and pvArbitrary is at TEB offset 0x28 instead of 0x14. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205342 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-01 18:34:21 +00:00
Matt Arsenault	a173c01556	Fix missing RUN line in test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205341 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-01 18:34:13 +00:00
Matt Arsenault	eb8eac6be2	Make isSetCCEquivalent respect the TargetBooleanContents git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205336 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-01 18:13:26 +00:00
Tim Northover	c077472250	ARM: add cyclone CPU with ZeroCycleZeroing feature. The Cyclone CPU is similar to swift for most LLVM purposes, but does have two preferred instructions for zeroing a VFP register. This teaches LLVM about them. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205309 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-01 13:22:02 +00:00
Tim Northover	ed839120c4	ARM64: add intrinsic for pmull (p64 x p64 = p128) operations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205302 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-01 12:22:37 +00:00
Tim Northover	acfb679618	ARM64: add patterns for more lane-wise ld1/st1 operations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205294 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-01 10:37:09 +00:00
Tim Northover	233c567ea9	ARM64: fix bug in ld3r (1d) SelectionDAG. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205293 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-01 10:37:03 +00:00
Alexey Volkov	1d75829ddc	[x86] Do not convert to cmp32 for Atom arch by Sergey Okunev Differential Revision: http://llvm-reviews.chandlerc.com/D2824 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205288 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-01 08:13:07 +00:00
David Blaikie	a07c1ab4e6	DebugInfo: Avoid creating unnecessary/empty line tables and remove the special case of '0' in DwarfCompileUnit::initStmtList by just always using a label difference This moves one case of raw text checking down into the MCStreamer interfaces in the form of a virtual function, even if we ultimately end up consolidating on the one-or-many line tables issue one day, this is nicer in the interim. This just generally streamlines a bunch of use cases into a common code path. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205287 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-01 08:07:52 +00:00
Juergen Ributzka	d8c9577764	[Stackmaps] Update the stackmap format to use 64-bit relocations for the function address and properly align all entries. This commit updates the stackmap format to version 1 to indicate the reorganizaion of several fields. This was done in order to align stackmap entries to their natural alignment and to minimize padding. Fixes <rdar://problem/16005902> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205254 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-31 22:14:04 +00:00
Matt Arsenault	193c3e91b9	R600: Compute masked bits for min and max git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205242 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-31 19:35:33 +00:00
Matt Arsenault	828bfc7350	R600: Add BFE, BFI, and BFM intrinsics to help with writing tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205236 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-31 18:21:18 +00:00
Hal Finkel	e2b3751924	[PowerPC] Don't ever expand BUILD_VECTOR of v2i64 with shuffles If we have two unique values for a v2i64 build vector, this will always result in two vector loads if we expand using shuffles. Only one is necessary. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205231 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-31 17:48:16 +00:00
Yaron Keren	b6f3f9ba45	Two updated tests for MinGW 32 and 64 exception handling code generation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205227 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-31 17:34:15 +00:00
Eli Bendersky	416a0e993f	Fix for PR19099 - NVPTX produces invalid symbol names. This is a more thorough fix for the issue than r203483. An IR pass will run before NVPTX codegen to make sure there are no invalid symbol names that can't be consumed by the ptxas assembler. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205212 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-31 15:56:26 +00:00
Tim Northover	93b3fcae28	ARM64: add extra patterns for scalar shifts git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205209 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-31 15:46:46 +00:00
Tim Northover	812a174ba4	ARM64: add extra scalar neg pattern & tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205208 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-31 15:46:42 +00:00
Tim Northover	b35ffdff16	ARM64: add patterns for scalar sqdmlal & sqdmlsl. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205207 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-31 15:46:38 +00:00
Tim Northover	9542846832	ARM64: add more patterns for commuted fmsub operations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205206 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-31 15:46:34 +00:00
Tim Northover	4417e07e39	ARM64: shuffle patterns around for fmin/fmax & add tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205205 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-31 15:46:30 +00:00
Tim Northover	576c3f709f	ARM64: add more scalar patterns for usqadd & suqadd. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205204 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-31 15:46:26 +00:00
Tim Northover	13277a78bc	ARM64: add more scalar patterns for reciprocal ops. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205203 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-31 15:46:22 +00:00
Tim Northover	8f93e159ed	ARM64: add i64 scalar pattern for @llvm.arm64.abs This will be used by the Clang front-end code for vabsd_s64. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205202 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-31 15:46:17 +00:00
Tom Stellard	1f143aa3e9	R600/SI: Lower i64 SELECT by bitcasting to a vector type This allows allows us to replace ISD::EXTRACT_ELEMENT, which is lowered using shifts, with ISD::EXTRACT_VECTOR_ELT, which is a no-op. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205187 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-31 14:01:55 +00:00
Zoran Jovanovic	077aa54e4e	Fixed issue with microMIPS JAL instruction. Differential Revision: http://llvm-reviews.chandlerc.com/D3200 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205185 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-31 14:00:10 +00:00
Hal Finkel	65fafbb109	Look at shuffles of build_vectors in DAGCombiner::visitEXTRACT_VECTOR_ELT When the loop vectorizer vectorizes code that uses the loop induction variable, we often end up with IR like this: %b1 = insertelement <2 x i32> undef, i32 %v, i32 0 %b2 = shufflevector <2 x i32> %b1, <2 x i32> undef, <2 x i32> zeroinitializer %i = add <2 x i32> %b2, <i32 2, i32 3> If the add in this example is not legal (as is the case on PPC with VSX), it will be scalarized, and we'll end up with a number of extract_vector_elt nodes with the vector shuffle as the input operand, and that vector shuffle is fed by one or more build_vector nodes. By the time that vector operations are expanded, visitEXTRACT_VECTOR_ELT will not create new extract_vector_elt by looking through the vector shuffle (to make sure that no illegal operations are created), and so the extract_vector_elt -> vector shuffle -> build_vector is never simplified to an operand of the build vector. By looking at build_vectors through a shuffle we fix this particular situation, preventing a vector from being built, only to be deconstructed again (for the scalarized add) -- an expensive proposition when this all needs to be done via the stack. We probably want a more comprehensive fix here where we look back recursively through any shuffles to any build_vectors or scalar_to_vectors, etc. but that can come later. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205179 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-31 11:43:19 +00:00
Daniel Sanders	dd67fd636c	[mips] Check emitted code for llvm.bswap.i32 on MIPS16/MIPS64 and llvm.bswap.i64 on MIPS16. While reviewing r204163, I noticed that the MIPS16 test only checked for a .ent directive and didn't actually check the code emitted. Fixed this and added a check for llvm.bswap.i32 on MIPS64 at the same time. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205177 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-31 11:00:04 +00:00
Chandler Carruth	2530cd31f4	[ARM64] Fix materialization of an fp128 zero immediate. There currently is not a pattern to lower this with clever instructions that zero the register, so restrict the zero immediate legality special case to f64 and f32 (the only two sizes which fmov seems to directly support). Fixes backend errors when building code such as libxml. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205161 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-31 00:02:10 +00:00
Hal Finkel	111bcf9b59	Make use of previously generated stores in SelectionDAGLegalize::ExpandExtractFromVectorThroughStack When expanding EXTRACT_VECTOR_ELT and EXTRACT_SUBVECTOR using SelectionDAGLegalize::ExpandExtractFromVectorThroughStack, we store the entire vector and then load the piece we want. This is fine in isolation, but generating a new store (and corresponding stack slot) for each extraction ends up producing code of poor quality. When we scalarize a vector operation (using SelectionDAG::UnrollVectorOp for example) we generate one EXTRACT_VECTOR_ELT for each element in the vector. This used to generate one stored copy of the vector for each element in the vector. Now we search the uses of the vector for a suitable store before generating a new one, which results in much more efficient scalarization code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205153 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-30 15:10:18 +00:00
Hal Finkel	ee8e48d4c9	[PowerPC] Handle VSX v2i64 SIGN_EXTEND_INREG sitofp from v2i32 to v2f64 ends up generating a SIGN_EXTEND_INREG v2i64 node (and similarly for v2i16 and v2i8). Even though there are no sign-extension (or algebraic shifts) for v2i64 types, we can handle v2i32 sign extensions by converting two and from v2i64. The small trick necessary here is to shift the i32 elements into the right lanes before the i32 -> f64 step. This is because of the big Endian nature of the system, we need the i32 portion in the high word of the i64 elements. For v2i16 and v2i8 we can do the same, but we first use the default Altivec shift-based expansion from v2i16 or v2i8 to v2i32 (by casting to v4i32) and then apply the above procedure. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205146 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-30 13:22:59 +00:00
NAKAMURA Takumi	a60d50cb9f	Suppress llvm/test/CodeGen/ARM64 for targeting pecoff. ARM64 is unaware of that. FIXME: Could we support them? git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205126 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-30 05:01:17 +00:00
Hal Finkel	7563821402	[PowerPC] Handle v2i64 comparisons v2i64 is a legal type under VSX, however we don't have native vector comparisons. We can handle eq/ne by casting it to an Altivec type, but everything else must be expanded. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205106 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-29 16:04:40 +00:00
Tim Northover	7b837d8c75	ARM64: initial backend import This adds a second implementation of the AArch64 architecture to LLVM, accessible in parallel via the "arm64" triple. The plan over the coming weeks & months is to merge the two into a single backend, during which time thorough code review should naturally occur. Everything will be easier with the target in-tree though, hence this commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205090 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-29 10:18:08 +00:00
Hal Finkel	44b2b9dc1a	[PowerPC] Add subregister classes for f64 VSX values We had stored both f64 values and v2f64, etc. values in the VSX registers. This worked, but was suboptimal because we would always spill 16-byte values even through we almost always had scalar 8-byte values. This resulted in an increase in stack-size use, extra memory bandwidth, etc. To fix this, I've added 64-bit subregisters of the Altivec registers, and combined those with the existing scalar floating-point registers to form a class of VSX scalar floating-point registers. The ABI code has also been enhanced to use this register class and some other necessary improvements have been made. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205075 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-29 05:29:01 +00:00
Akira Hatanaka	b3cb36026a	[x86] Fix printing of register operands with q modifier. Emit 32-bit register names instead of 64-bit register names if the target does not have 64-bit general purpose registers. <rdar://problem/14653996> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205067 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-28 23:28:07 +00:00
David Majnemer	f3e8b0575d	X86: Disable IsLegalToCallImmediateAddr for Win32 WinCOFF cannot form PC relative relocations to support absolute MCValues. We should reenable this once WinCOFF supports emission of IMAGE_REL_I386_REL32 relocations. This fixes PR19272. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205058 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-28 21:40:47 +00:00
Hal Finkel	0e11c017a9	[PowerPC] Fix VSX permutation isel Not only did I invert the indices when I wrote the code, but I also did the same thing when I wrote the regression test. Oops. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205046 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-28 20:24:55 +00:00
Hal Finkel	c9de9e60b9	[PowerPC] v2[fi]64 need to be explicitly passed in VSX registers v2[fi]64 values need to be explicitly passed in VSX registers. This is because the code in TRI that finds the minimal register class given a register and a value type will assert if given an Altivec register and a non-Altivec type. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205041 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-28 19:58:11 +00:00
Hal Finkel	e2ee98ab16	[PowerPC] Use a small cleanup pass to remove VSX self copies As explained in r204976, because of how the allocation of VSX registers interacts with the call-lowering code, we sometimes end up generating self VSX copies. Specifically, things like this: %VSL2<def> = COPY %F2, %VSL2<imp-use,kill> (where %F2 is really a sub-register of %VSL2, and so this copy is a nop) This adds a small cleanup pass to remove these prior to post-RA scheduling. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204980 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-27 23:12:31 +00:00
Hal Finkel	6bdc4ebedd	[PowerPC] Fix v2f64 vector extract and related patterns First, v2f64 vector extract had not been declared legal (and so the existing patterns were not being used). Second, the patterns for that, and for scalar_to_vector, should really be a regclass copy, not a subregister operation, because the VSX registers directly hold both the vector and scalar data. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204971 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-27 22:22:48 +00:00

... 2 3 4 5 6 ...

10535 Commits