llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-11-08 19:06:39 +00:00

Author	SHA1	Message	Date
Matt Arsenault	d14e5ec25d	R600/SI: Minor test scheduling fixes This prevents these from failing in a later commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229134 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-13 19:04:56 +00:00
Jozef Kolek	85e08ed8a4	[mips][microMIPS] Delay slot filler: Replace the microMIPS JR with the JRC This patch extends the MIPS delay slot filler so that, when it would have to put a NOP instruction into the delay slot of a microMIPS JR instruction, it replaces the JR with the compact jump instruction JRC instead of emitting the NOP. Differential Revision: http://reviews.llvm.org/D7522 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229128 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-13 17:51:27 +00:00
Andrea Di Biagio	59d115311a	[CodeGenPrepare] Removed duplicate logic. SimplifyCFG already knows how to speculate calls to cttz/ctlz. SimplifyCFG now knows how to speculate calls to intrinsic cttz/ctlz that are 'cheap' for the target. Therefore, some of the logic in CodeGenPrepare that was originally added at revision 224899 can now be removed. This patch is basically a no functional change. It removes the duplicated logic in CodeGenPrepare and converts all the existing target specific tests for cttz/ctlz into SimplifyCFG tests. Differential Revision: http://reviews.llvm.org/D7608 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229105 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-13 14:15:48 +00:00
James Molloy	c76bff0715	[SimplifyCFG] Be more aggressive Up the phi node folding threshold from a cheap "1" to a meagre "2". Update tests for extra added selects and slight code churn. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229099 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-13 10:48:30 +00:00
Chandler Carruth	00ae03a747	Revert a series of commits starting at r228886 which is triggering some regressions for LLDB on Linux. Rafael indicated on lldb-dev that we should just go ahead and revert these but that he wasn't at a computer. The patches backed out are as follows: r228980: Add support for having multiple sections with the name and ... r228889: Invert the section relocation map. r228888: Use the existing SymbolTableIndex instead of doing a lookup. r228886: Create the Section -> Rel Section map when it is first needed. These patches look pretty nice to me, so hoping it's not too hard to get them re-instated. =D git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229080 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-13 07:52:39 +00:00
Craig Topper	f3455f13a2	[X86] Add support for parsing and printing the mnemonic aliases for the XOP VPCOM instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229078 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-13 07:42:25 +00:00
Craig Topper	d4f1c60bf5	Fix probable typo in test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229070 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-13 06:07:27 +00:00
Craig Topper	db9343fb40	[X86] Remove int_x86_sse2_psll_dq_bs and int_x86_sse2_psrl_dq_bs intrinsics. The builtins aren't used by clang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@229069 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-13 06:07:24 +00:00
Rafael Espindola	2fa06b171b	Add support for having multiple sections with the same name and comdat. Using this in combination with -ffunction-sections allows LLVM to output a .o file with multiple sections named .text. This saves space by avoiding long unique names of the form .text.<C++ mangled name>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228980 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-12 23:29:51 +00:00
David Majnemer	73a92d5136	X86: Don't crash if we can't decode the pshufb mask Constant pool entries are uniqued by their contents regardless of their type. This means that a pshufb can have a shuffle mask which isn't a simple array of bytes. The code path which attempts to decode the mask didn't check for failure, causing PR22559. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228979 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-12 23:26:26 +00:00
Simon Pilgrim	6911b3bc37	Ensure integer domain on general shuffle stack folding tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228972 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-12 22:47:45 +00:00
David Blaikie	c3dc5bac73	Remove typedef of a pointer type used in a gep to simplify migration of geps to a typeless-pointer future. I'd modify my migration tool to account for this, but this is the only instance of a typedef'd pointer type to a gep I found in the whole test suite, so it didn't seem worthwhile. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228970 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-12 22:45:25 +00:00
Hal Finkel	8452d01224	[SDAG] Don't try to use FP_EXTEND/FP_ROUND for int<->fp promotions The PowerPC backend has long promoted some floating-point vector operations (such as select) to integer vector operations. Unfortunately, this behavior was broken by r216555. When using FP_EXTEND/FP_ROUND for promotions, we must check that both the old and new types are floating-point types. Otherwise, we must use BITCAST as we did prior to r216555 for everything. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228969 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-12 22:43:52 +00:00
Reed Kotler	36068aae42	Add bulk of returning of values to Mips fast-isel Summary: Implement the bulk of returning values in Mips fast-isel Test Plan: reatabi.ll Passes test-suite at -O0,-O2 and with mips32r2 and mips32r1. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits, aemerson, rfuhler Differential Revision: http://reviews.llvm.org/D5920 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228958 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-12 21:05:12 +00:00
Rafael Espindola	c3c5d7c2d6	On ELF, put PIC jump tables in a non executable section. Fixes PR22558. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228939 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-12 17:46:49 +00:00
Rafael Espindola	8eeedf74d3	Put each jump table in an independent section if the function is too. This allows the linker to GC both, fixing pr22557. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228937 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-12 17:16:46 +00:00
Michael Kuperstein	fb107d8bf0	[X86] Call frame optimization - allow stack-relative movs to be folded into a push Since we track esp precisely, there's no reason not to allow this. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228924 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-12 14:17:35 +00:00
Elena Demikhovsky	f41b8e3e49	AVX-512: Fixed the "test" operation for i1 type Using KORTESTW for comparison i1 value with zero was wrong since the instruction tests 16 bits. KORTESTW may be used with KSHIFTL+KSHIFTR that clean the 15 upper bits. I removed (X86cmp i1, 0) pattern and zero-extend i1 to i8 and then use TESTB. There are some cases where i1 is in the mask register and the upper bits are already zeroed. Then KORTESTW is the better solution, but it is subject for optimization. Meanwhile, I'm fixing the correctness issue. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228916 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-12 08:40:34 +00:00
Michael Kuperstein	fd98d3be55	[X86] A heuristic to estimate the size impact for converting stack-relative parameter movs to pushes This gives a rough estimate of whether using pushes instead of movs is profitable, in terms of size. We go over all calls in the MachineFunction and compute: a) For each callsite that can not use pushes, the penalty of not having a reserved call frame. b) For each callsite that can use pushes, the gain of actually replacing the movs with pushes (and the potential penalty of having to readjust the stack). Differential Revision: http://reviews.llvm.org/D7561 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228915 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-12 08:36:35 +00:00
Ahmed Bougacha	9e9bde9b54	[CodeGen] Don't blindly combine (fp_round (fp_round x)) to (fp_round x). We used to do this DAG combine, but it's not always correct: If the first fp_round isn't a value preserving truncation, it might introduce a tie in the second fp_round, that wouldn't occur in the single-step fp_round we want to fold to. In other words, double rounding isn't the same as rounding. Differential Revision: http://reviews.llvm.org/D7571 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228911 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-12 06:15:29 +00:00
Hal Finkel	091acf253b	[PowerPC] Mark jumps as expensive (using CR bits) On PowerPC, which has a full set of logical operations on (its multiple sets of) condition-register bits, it is not profitable to break up complex conditions feeding a jump into multiple jumps. We can turn off this feature of CGP/SDAGBuilder by marking jumps as "expensive". P7 test-suite speedups (no regressions): MultiSource/Benchmarks/FreeBench/pcompress2/pcompress2 -0.626647% +/- 0.323583% MultiSource/Benchmarks/Olden/power/power -18.2821% +/- 8.06481% git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228895 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-12 01:02:52 +00:00
Tom Stellard	293dfe59a5	R600/SI: Disable subreg liveness This is temporary while we try to fix a crash in the register coalescer. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228861 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-11 18:24:53 +00:00
Simon Pilgrim	d606d6bfe1	[X86][SSE] Added dual vector truncation tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228857 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-11 18:14:35 +00:00
Tom Stellard	946f5e91ef	R600/SI: Fix -march in test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228848 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-11 17:11:48 +00:00
Sanjay Patel	cb2ff33a8a	fixed to test features, not CPUs git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228836 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-11 15:00:41 +00:00
Sanjay Patel	00fb386b23	fixed to test features, not CPUs git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228835 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-11 15:00:19 +00:00
Sanjay Patel	57fb1850e2	fixed to test features, not CPUs git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228834 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-11 14:58:25 +00:00
Marek Olsak	c0021e43ea	R600/SI: Enable a lot of existing tests for VI (squashed commits) This is a union of these commits: * R600/SI: Enable more tests for VI which need no changes * R600/SI: Enable V_BCNT tests for VI Differences: - v_bcnt_..._e32 -> _e64 - s_load_dword* inline offset is in bytes instead of dwords * R600/SI: Enable all tests for VI which use S_LOAD_DWORD The inline offset is changed from dwords to bytes. * R600/SI: Enable LDS tests for VI Differences: - the s_load_dword inline offset changed from dwords to bytes - the tests checked very little on CI, so they have been fixed to check all instructions that "SI" checked * R600/SI: Enable lshr tests for VI * R600/SI: Fix divrem64 tests - "v_lshl_64" was missing "b" before "64" - added VI-NOT checks * R600/SI: Enable the SI.tid test for VI * R600/SI: Enable the frem test for VI Also, the frem_f64 checking is added for CI-VI. * R600/SI: Add VI tests for rsq.clamped git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228830 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-11 14:26:46 +00:00
James Molloy	1777ae7a05	Make buildbots better. This testcase change was associated incorrectly to a followup commit in my git tree, not the base commit. Sorry! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228827 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-11 12:24:09 +00:00
James Molloy	4de471dd0a	[SimplifyCFG] Swap to using TargetTransformInfo for cost analysis. We're already using TTI in SimplifyCFG, so remove the hard-baked "cheapness" heuristic and use TTI directly. Generally NFC intended, but we're using a slightly different heuristic now so there is a slight test churn. Test changes: * combine-comparisons-by-cse.ll: Removed unneeded branch check. * 2014-08-04-muls-it.ll: Test now doesn't branch but emits muleq. * coalesce-subregs.ll: Superfluous block check. * 2008-01-02-hoist-fp-add.ll: fadd is safe to speculate. Change to udiv. * PhiBlockMerge.ll: Superfluous CFG checking code. Main checks still present. * select-gep.ll: A variable GEP is not expensive, just TCC_Basic, according to the TTI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228826 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-11 12:15:41 +00:00
Tom Stellard	9f5d593c1f	R600/SI: Store immediate offsets > 12-bits in soffset This will save us from having to extend these offsets to 64-bits and storing them in a pair of vgprs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228776 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-11 00:34:35 +00:00
Petar Jovanovic	56e9c5f92b	Fix makeLibCall argument (signed) in SoftenFloatRes_XINT_TO_FP function The isSigned argument of makeLibCall function was hard-coded to false (unsigned). This caused zero extension on MIPS64 soft float. As the result SingleSource/Benchmarks/Stanford/FloatMM test and SingleSource/UnitTests/2005-07-17-INT-To-FP test failed. The solution was to use the proper argument. Patch by Strahinja Petrovic. Differential Revision: http://reviews.llvm.org/D7292 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228765 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-10 23:30:14 +00:00
David Majnemer	f2138c2df8	X86: @llvm.frameaddress should defer to SelectionDAG for Win CFI git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228754 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-10 22:00:34 +00:00
David Majnemer	420f72a301	X86: Make @llvm.frameaddress work correctly with Windows unwind codes Simply loading or storing the frame pointer is not sufficient for Windows targets. Instead, create a synthetic frame object that we will lower later. References to this synthetic object will be replaced with the correct reference to the frame address. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228748 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-10 21:22:05 +00:00
Daniel Jasper	363b645818	Fix overly prescriptive test that broke on Mac after r228725. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228742 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-10 20:49:05 +00:00
Bill Schmidt	49b3971b70	[PowerPC] Fix reverted patch r227976 to avoid register assignment issues See full discussion in http://reviews.llvm.org/D7491. We now hide the add-immediate and call instructions together in a separate pseudo-op, which is tagged to define GPR3 and clobber the call-killed registers. The PPCTLSDynamicCall pass prior to RA now expands this op into the two separate addi and call ops, with explicit definitions of GPR3 on both instructions, and explicit clobbers on the call instruction. The pass is now marked as requiring and preserving the LiveIntervals and SlotIndexes analyses, and fixes these up after the replacement sequences are introduced. Self-hosting has been verified on LE P8 and BE P7 with various optimization levels, etc. It has also been verified with the --no-tls-optimize flag workaround removed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228725 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-10 19:09:05 +00:00
David Majnemer	3163865f01	X86: Emit Win64 SaveXMM opcodes at the right offset in the right order Walk the instructions marked FrameSetup and consider any stores of XMM registers to the stack as needing a SaveXMM opcode. This fixes PR22521. Differential Revision: http://reviews.llvm.org/D7527 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228724 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-10 19:01:47 +00:00
Paul Robinson	a932cb6d09	Explicitly initialize a flag in a default constructor. Works around a Visual C++ issue. Patch by Douglas Yung! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228699 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-10 15:30:02 +00:00
Bradley Smith	cec93b661d	[ARM] Add armv6s[-]m as an alias to armv6[-]m git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228696 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-10 15:15:08 +00:00
Simon Pilgrim	c99d58d6c1	[X86][AVX2] Missing AVX2 memory folding instructions Added most of the missing vector folding patterns for AVX2 (as well as fixing the vpermpd and vpermq patterns) Differential Revision: http://reviews.llvm.org/D7492 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228688 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-10 13:22:57 +00:00
Simon Pilgrim	8bcc093da5	[X86][XOP] Added XOP memory folding patterns + tests This patch adds the complete AMD Bulldozer XOP instruction set to the memory folding pattern tables for stack folding, etc. Note: Many of the XOP instructions have multiple table entries as it can fold loads from different sources. Differential Revision: http://reviews.llvm.org/D7484 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228685 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-10 12:57:17 +00:00
Andrea Di Biagio	bd1729e5d4	[X86][FastIsel] Avoid introducing legacy SSE instructions if the target has AVX. This patch teaches X86FastISel how to select AVX instructions for scalar float/double convert operations. Before this patch, X86FastISel always selected legacy SSE instructions for FPExt (from float to double) and FPTrunc (from double to float). For example: \code define double @foo(float %f) { %conv = fpext float %f to double ret double %conv } \end code Before (with -mattr=+avx -fast-isel) X86FastIsel selected a CVTSS2SDrr which is legacy SSE: cvtss2sd %xmm0, %xmm0 With this patch, X86FastIsel selects a VCVTSS2SDrr instead: vcvtss2sd %xmm0, %xmm0, %xmm0 Added test fast-isel-fptrunc-fpext.ll to check both the register-register and the register-memory float/double conversion variants. Differential Revision: http://reviews.llvm.org/D7438 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228682 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-10 12:04:41 +00:00
Nick Lewycky	3c5236ae68	Remove non-test files that appear to have been accidentally committed in r228641. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228657 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-10 02:39:17 +00:00
Chandler Carruth	1c7c2e8650	[x86] Fix PR22524: the DAG combiner was incorrectly handling illegal nodes when folding bitcasts of constants. We can't fold things and then check after-the-fact whether it was legal. Once we have formed the DAG node, arbitrary other nodes may have been collapsed to it. There is no easy way to go back. Instead, we need to test for the specific folding cases we're interested in and ensure those are legal first. This could in theory make this less powerful for bitcasting from an integer to some vector type, but AFAICT, that can't actually happen in the SDAG so it's fine. Now, we only whitelist specific int->fp and fp->int bitcasts for post-legalization folding. I've added the test case from the PR. (Also as a note, this does not appear to be in 3.6, no backport needed) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228656 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-10 02:25:56 +00:00
David Majnemer	69114ee016	X86: Emit an ABI compliant prologue and epilogue for Win64 Win64 has specific constraints on what valid prologues and epilogues look like. This constraint is born from the flexibility and descriptiveness of Win64's unwind opcodes. Prologues previously emitted by LLVM could not be represented by the unwind opcodes, preventing operations powered by stack unwinding from working successfully. Differential Revision: http://reviews.llvm.org/D7520 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228641 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-10 00:57:42 +00:00
Colin LeMahieu	6194244842	[Hexagon] Factoring classes out of store patterns. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228602 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-09 20:33:46 +00:00
Sanjay Patel	93411cf4f8	fixed to test features, not CPUs git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228581 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-09 17:17:09 +00:00
Kit Barton	f60b0de42a	This change implements the following three logical vector operations: veqv (vector equivalence) vnand vorc I increased the AddedComplexity for these instructions to 500 to ensure they are generated instead of issuing other VSX instructions. Phabricator review: http://reviews.llvm.org/D7469 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228580 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-09 17:03:18 +00:00
Akira Hatanaka	522cf235e9	Fix a bug in DemoteRegToStack where a reload instruction was inserted into the wrong basic block. This would happen when the result of an invoke was used by a phi instruction in the invoke's normal destination block. An instruction to reload the invoke's value would get inserted before the critical edge was split and a new basic block (which is the correct insertion point for the reload) was created. This commit fixes the bug by splitting the critical edge before all the reload instructions are inserted. Also, hoist up the code which computes the insertion point to the only place that needs that computation. rdar://problem/15978721 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228566 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-09 06:38:23 +00:00
Sanjay Patel	4616d7dd2f	fix test attributes; this is an SSE2 test, not a Nehalem test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228546 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-08 21:14:27 +00:00
Sanjay Patel	751e3f1f80	fix test attributes; this is an x86-64 test, not a Nehalem test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228545 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-08 21:10:40 +00:00
Sanjay Patel	714f3d3a0f	fix test attributes; these are SSE2 tests, not Nehalem tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228544 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-08 21:05:03 +00:00
Sanjay Patel	e755d452e0	fix test attributes; these are SSE2 tests, not Nehalem tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228541 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-08 20:50:58 +00:00
Sanjay Patel	78547012ac	fix test attributes; these are x86-64 tests, not Nehalem tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228536 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-08 20:05:53 +00:00
Sanjay Patel	8d32999929	fix test attributes; these are MMX tests, not Nehalem tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228535 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-08 20:01:12 +00:00
Sanjay Patel	7596cf2b66	fix test attributes; these are SSE2 tests, not Nehalem tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228534 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-08 19:50:55 +00:00
Sanjay Patel	c3803c8bc2	generalize test; nothing Nehalem-specific here git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228532 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-08 19:38:25 +00:00
Simon Pilgrim	c92ffedc5c	[X86][AVX2] AVX2 broadcast + permute memory folding tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228528 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-08 18:33:13 +00:00
Tim Northover	848d812931	ARM & AArch64: teach LowerVSETCC that output type size may differ from input. While various DAG combines try to guarantee that a vector SETCC operation will have the same output size as input, there's nothing intrinsic to either creation or LegalizeTypes that actually guarantees it, so the function needs to be ready to handle a mismatch. Fortunately this is easy enough, just extend or truncate the naturally compared result. I couldn't reproduce the failure in other backends that I know have SIMD, so it's probably only an issue for these two due to shared heritage. Should fix PR21645. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228518 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-08 00:50:47 +00:00
Simon Pilgrim	437265ee96	[X86][AVX2] AVX2 integer stack folding tests. This adds tests for the remaining AVX2 instructions that currently support memory folding. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228513 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-07 23:28:16 +00:00
Simon Pilgrim	2134ae7f38	[X86][AVX] Added missing stack folding support + test for vptest ymm instruction git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228509 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-07 21:44:06 +00:00
Simon Pilgrim	710e70bb70	[X86][SSE] Added missing stack folding tests for (v)mpsadbw instruction git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228506 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-07 21:20:11 +00:00
Simon Pilgrim	3281412d2a	[X86] Force fp stack folding tests to keep to specific domain. General boolean instructions (AND, ANDN, OR, XOR) need to use a specific domain instruction (and not just the default). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228495 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-07 16:14:55 +00:00
Simon Pilgrim	bf4a435d0a	[X86][AVX2] More AVX2 integer stack folding tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228494 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-07 16:07:27 +00:00
David Majnemer	fdac306a12	MC: Emit COFF section flags in the "proper" order COFF section flags are not idempotent: 'rd' will make a read-write section because 'd' implies write 'dr' will make a read-only section because 'r' disables write git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228490 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-07 08:26:40 +00:00
Hal Finkel	05bd43dc6e	[PowerPC] Handle loop predecessor invokes If a loop predecessor has an invoke as its terminator, and the return value from that invoke is used to determine the loop iteration space, then we can't insert a computation based on that value in the loop predecessor prior to the terminator (oops). If there's such an invoke, or just no predecessor for that matter, insert a new loop preheader. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228488 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-07 07:32:58 +00:00
Hal Finkel	9ce4011708	[PowerPC] Fixup incomplete revert of test/CodeGen/PowerPC/tls-pic.ll git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228467 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-06 23:30:06 +00:00
Simon Pilgrim	148482dd6b	[X86][AVX2] Begun adding AVX2 integer stack folding tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228462 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-06 23:12:15 +00:00
Hal Finkel	9168f717c9	Revert "r227976 - [PowerPC] Yet another approach to __tls_get_addr" and related fixups Unfortunately, even with the workaround of disabling the linker TLS optimizations in Clang restored (which has already been done), this still breaks self-hosting on my P7 machine (-O3 -DNDEBUG -mcpu=native). Bill is currently working on an alternate implementation to address the TLS issue in a way that also fully elides the linker bug (which, unfortunately, this approach did not fully), so I'm reverting this now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228460 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-06 23:07:40 +00:00
Reid Kleckner	6dc42dd2da	Don't dllexport declarations Fixes PR22488 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228411 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-06 17:59:49 +00:00
Michel Danzer	7097d17da0	R600/SI: Amend a test to ensure WQM is enabled for LDS in pixel shaders Reviewed-by: Tom Stellard <tom@stellard.net> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228374 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-06 02:51:29 +00:00
Michel Danzer	971f0f0071	R600/SI: Don't enable WQM for V_INTERP_* instructions v2 Doesn't seem necessary anymore. I think this was mostly compensating for not enabling WQM for texture sampling instructions. v2: Add test coverage Reviewed-by: Tom Stellard <tom@stellard.net> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228373 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-06 02:51:25 +00:00
Michel Danzer	a7879dcf33	R600/SI: Also enable WQM for image opcodes which calculate LOD v3 If whole quad mode isn't enabled for these, the level of detail is calculated incorrectly for pixels along diagonal triangle edges, causing artifacts. v2: Use a TSFlag instead of lots of switch cases v3: Add test coverage Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88642 Reviewed-by: Tom Stellard <tom@stellard.net> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228372 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-06 02:51:20 +00:00
Matthias Braun	3fd0775f06	AArch64: Make test more robust. Avoid the creation of select instructions which can result in different scheduling of the selects. I also added a bunch of additional store volatiles. Those avoid a CodeGen problem (bug?) where normalizing and denormalizing the control moves all shift instructions into the first block where ISel can't match them together with the cmps. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228362 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-05 23:52:14 +00:00
Matthias Braun	b8b2dff046	X86: Test cleanup Use FileCheck, make it more consistent and do not rely on unoptimized or(cmp,cmp) getting combined for max to be matched. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228361 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-05 23:52:12 +00:00
Colin LeMahieu	71166427a3	[Hexagon] Simplifying and formatting several patterns. Changing a pattern multiply to be expanded. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228347 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-05 21:13:25 +00:00
Hal Finkel	b8a6712c27	[PowerPC] Prepare loops for pre-increment loads/stores PowerPC supports pre-increment load/store instructions (except for Altivec/VSX vector load/stores). Using these on embedded cores can be very important, but most loops are not naturally set up to use them. We can often change that, however, by placing loops into a non-canonical form. Generically, this means transforming loops like this: for (int i = 0; i < n; ++i) array[i] = c; to look like this: T *p = &array[-1]; for (int i = 0; i < n; ++i) *++p = c; the key point is that addresses accessed are pulled into dedicated PHIs and "pre-decremented" in the loop preheader. This allows the use of pre-increment load/store instructions without loop peeling. A target-specific late IR-level pass (running post-LSR), PPCLoopPreIncPrep, is introduced to perform this transformation. I've used this code out-of-tree for generating code for the PPC A2 for over a year. Somewhat to my surprise, running the test suite + externals on a P7 with this transformation enabled showed no performance regressions, and one speedup: External/SPEC/CINT2006/483.xalancbmk/483.xalancbmk -2.32514% +/- 1.03736% So I'm going to enable it on everything for now. I was surprised by this because, on the POWER cores, these pre-increment load/store instructions are cracked (and, thus, harder to schedule effectively). But seeing no regressions, and feeling that it is generally easier to split instructions apart late than it is to combine them late, this might be the better approach regardless. In the future, we might want to integrate this functionality into LSR (but currently LSR does not create new PHI nodes, so (for that and other reasons) significant work would need to be done). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228328 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-05 18:43:00 +00:00
Hal Finkel	885b67a5c3	[PowerPC] Generate pre-increment floating-point ld/st instructions PowerPC supports pre-increment floating-point load/store instructions, both r+r and r+i, and we had patterns for them, but they were not marked as legal. Mark them as legal (and add a test case). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228327 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-05 18:42:53 +00:00
Ahmed Bougacha	ec35069525	[CodeGen] Add hook/combine to form vector extloads, enabled on X86. The combine that forms extloads used to be disabled on vector types, because "None of the supported targets knows how to perform load and sign extend on vectors in one instruction." That's not entirely true, since at least SSE4.1 X86 knows how to do those sextloads/zextloads (with PMOVS/ZX). But there are several aspects to getting this right. First, vector extloads are controlled by a profitability callback. For instance, on ARM, several instructions have folded extload forms, so it's not always beneficial to create an extload node (and trying to match extloads is a whole 'nother can of worms). The interesting optimization enables folding of s/zextloads to illegal (splittable) vector types, expanding them into smaller legal extloads. It's not ideal (it introduces some legalization-like behavior in the combine) but it's better than the obvious alternative: form illegal extloads, and later try to split them up. If you do that, you might generate extloads that can't be split up, but have a valid ext+load expansion. At vector-op legalization time, it's too late to generate this kind of code, so you end up forced to scalarize. It's better to just avoid creating egregiously illegal nodes. This optimization is enabled unconditionally on X86. Note that the splitting combine is happy with "custom" extloads. As is, this bypasses the actual custom lowering, and just unrolls the extload. But from what I've seen, this is still much better than the current custom lowering, which does some kind of unrolling at the end anyway (see for instance load_sext_4i8_to_4i64 on SSE2, and the added FIXME). Also note that the existing combine that forms extloads is now also enabled on legal vectors. This doesn't have a big effect on X86 (because sext+load is usually combined to sext_inreg+aextload). On ARM it fires on some rare occasions; that's for a separate commit. 
Differential Revision: http://reviews.llvm.org/D6904 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228325 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-05 18:31:02 +00:00
Andrew Trick	c4ae8cbc5d	X86 ABI fix for return values > 24 bytes. The return value's address must be returned in %rax. i.e. the callee needs to copy the sret argument (%rdi) into the return value (%rax). This probably won't manifest as a bug when the caller is LLVM-compiled code. But it is an ABI guarantee and tools expect it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228321 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-05 18:09:05 +00:00
Tom Stellard	c7198528eb	R600/SI: Fix bug in TTI loop unrolling preferences We should be setting UnrollingPreferences::MaxCount to MAX_UINT instead of UnrollingPreferences::Count. Count is a 'forced unrolling factor', while MaxCount sets an upper limit to the unrolling factor. Setting Count to MAX_UINT was causing the loop in the testcase to be unrolled 15 times, when it only had a maximum of 4 iterations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228303 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-05 15:32:18 +00:00
Tom Stellard	041211cd79	R600/SI: Fix bug from insertion of llvm.SI.end.cf into loop headers The llvm.SI.end.cf intrinsic is used to mark the end of if-then blocks, if-then-else blocks, and loops. It is responsible for updating the exec mask to re-enable threads that had been masked during the preceding control flow block. For example: s_mov_b64 exec, 0x3 ; Initial exec mask s_mov_b64 s[0:1], exec ; Saved exec mask v_cmpx_gt_u32 exec, s[2:3], v0, 0 ; llvm.SI.if do_stuff() s_or_b64 exec, exec, s[0:1] ; llvm.SI.end.cf The bug fixed by this patch was one where the llvm.SI.end.cf intrinsic was being inserted into the header of loops. This would happen when an if block terminated in a loop header and we would end up with code like this: s_mov_b64 exec, 0x3 ; Initial exec mask s_mov_b64 s[0:1], exec ; Saved exec mask v_cmpx_gt_u32 exec, s[2:3], v0, 0 ; llvm.SI.if do_stuff() LOOP: ; Start of loop header s_or_b64 exec, exec, s[0:1] ; llvm.SI.end.cf <-BUG: The exec mask has the same value at the beginning of each loop iteration. do_stuff(); s_cbranch_execnz LOOP The fix is to create a new basic block before the loop and insert the llvm.SI.end.cf there. This way the exec mask is restored before the start of the loop instead of at the beginning of each iteration. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228302 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-05 15:32:15 +00:00
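The effect of restoring the exec mask in the loop header versus before the loop can be imitated with a small bit-mask model. This is a sketch under stated assumptions, not a model of the real hardware or of the pass: one bit stands for one thread, the loop body is a hypothetical stand-in that retires the lowest active thread per iteration, and `run_loop` and `MAX_ITERS` are names invented here.

```c
#include <assert.h>
#include <stdint.h>

enum { MAX_ITERS = 64 };  /* safety cap so the buggy variant terminates */

/* Returns the number of iterations executed, or -1 if the loop had not
 * exited after MAX_ITERS. restore_in_header selects the buggy placement. */
int run_loop(uint64_t exec, uint64_t saved, int restore_in_header) {
    if (!restore_in_header)
        exec |= saved;       /* fix: end.cf in a new block before the loop */
    for (int iter = 1; iter <= MAX_ITERS; ++iter) {
        if (restore_in_header)
            exec |= saved;   /* bug: mask is reset at every iteration */
        exec &= exec - 1;    /* stand-in body: lowest active thread is done */
        if (exec == 0)
            return iter;     /* s_cbranch_execnz falls through */
    }
    return -1;
}
```

Starting from `exec = 0x1` and `saved = 0x3`, the fixed placement exits after two iterations, while the header placement keeps re-enabling the retired thread and never reaches an all-zero mask.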
Bill Schmidt	202b6045bf	[PowerPC] Implement the vclz instructions for PWR8 Patch by Kit Barton. Add the vector count leading zeros instruction for byte, halfword, word, and doubleword sizes. This is a fairly straightforward addition after the changes made for vpopcnt: 1. Add the correct definitions for the various instructions in PPCInstrAltivec.td 2. Make the CTLZ operation legal on vector types when using P8Altivec in PPCISelLowering.cpp Test Plan Created new test case in test/CodeGen/PowerPC/vec_clz.ll to check the instructions are being generated when the CTLZ operation is used in LLVM. Check the encoding and decoding in test/MC/PowerPC/ppc_encoding_vmx.s and test/Disassembler/PowerPC/ppc_encoding_vmx.txt respectively. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228301 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-05 15:24:47 +00:00
Bruno Cardoso Lopes	04715c9915	[X86][MMX] Handle i32->mmx conversion using movd Implement a BITCAST dag combine to transform i32->mmx conversion patterns into a X86 specific node (MMX_MOVW2D) and guarantee that moves between i32 and x86mmx are better handled, i.e., don't use store-load to do the conversion. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228293 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-05 13:23:07 +00:00
Bruno Cardoso Lopes	d4299719af	[X86][MMX] Add several bitcast tests Avoid regression in previously supported MMX code by adding different combinations of tests which exercise MMX bitcasts. Small improvements to these patterns should come next. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228292 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-05 13:22:57 +00:00
Matt Arsenault	81eb6ca158	R600/SI: Fix i64 truncate to i1 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228273 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-05 06:05:13 +00:00
Ahmed Bougacha	a7f2cf45f3	[ARM] Use patterns instead of hardcoded regs in test. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228259 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-05 01:52:19 +00:00
Ahmed Bougacha	42ec3433ef	[ARM] Make testcase more explicit. NFC. The q8/d16 thing is silly; I'd be happy to hear about a better way to write those tests where simple substitution isn't enough. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228258 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-05 01:45:28 +00:00
Tom Stellard	26bfda9dd3	R600/SI: Enable subreg liveness by default git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228228 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 23:14:18 +00:00
Rafael Espindola	e247dd2839	Don't try to make sections in comdats SHF_MERGE. Parts of llvm were not expecting it and we wouldn't print the entity size of the section. Given what comdats are used for, having SHF_MERGE sections would be just a small improvement, so just disable it for now. Fixes pr22463. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228196 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 21:27:24 +00:00
Tom Stellard	89c96b1cd0	R600/SI: Expand misaligned 16-bit memory accesses git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228190 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 20:49:52 +00:00
Tom Stellard	fd4c349de2	R600/SI: Make more store operations legal v2i32, i32, trunc i32 to i16, and trunc i32 to i8 stores are legal for all address spaces. We had marked them as custom in order to lower them for the private address space, but this is no longer necessary. This enables lowering of misaligned stores of these types in the DAGLegalizer. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228189 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 20:49:51 +00:00
Tom Stellard	056a34916a	R600: Don't promote i64 stores to v2i32 during DAG legalization We take care of this during instruction selection now. This fixes a potential infinite loop when lowering misaligned stores. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228188 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 20:49:49 +00:00
Bill Schmidt	b9fc61d031	Add missing test case from r228046 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228182 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 20:00:04 +00:00
Matthias Braun	a602c10686	MachineCSE: Clear dead-def flag on CSE. In case CSE reuses a previously unused register the dead-def flag has to be cleared on the def operand, as exposed by the arm64-cse.ll test. This fixes PR22439 and the corresponding rdar://19694987 Differential Revision: http://reviews.llvm.org/D7395 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228178 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 19:35:16 +00:00
Michael Kuperstein	8f260e3084	Fixes a bug in vector load legalization that confused bits and bytes. Differential Revision: http://reviews.llvm.org/D7400 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228168 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 18:54:01 +00:00
Colin LeMahieu	47d6e4d009	[Hexagon] Adding encoding information for absolute-reg mode stores. Xfailing a test until constant extenders are correctly put in the same packet. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228158 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 17:52:06 +00:00
Zoran Jovanovic	8dc0ae6606	[mips][microMIPS] Implement CodeGen support for SW16 and LW16 instructions Differential Revision: http://reviews.llvm.org/D6581 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228149 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 15:43:17 +00:00
Daniel Sanders	712010f655	[mips] Remove unused check prefix from tests. NFC. Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7376 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228145 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 14:48:39 +00:00
Renato Golin	0966a4e370	Adding support to LLVM for targeting Cortex-A72 Currently, Cortex-A72 is modelled as a Cortex-A57 except the fp load balancing pass isn't enabled for Cortex-A72 as it's not profitable to have it enabled for this core. Patch by Ranjeet Singh. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228140 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 13:31:29 +00:00
Chandler Carruth	b0589710cc	[x86] Give movss and movsd execution domains in the x86 backend. This associates movss and movsd with the packed single and packed double execution domains (resp.). While this is largely cosmetic, as we now don't have weird ping-pong-ing between single and double precision, it is also useful because it avoids the domain fixing algorithm from seeing domain breaks that don't actually exist. It will also be much more important if we have an execution domain default other than packed single, as that would cause us to mix movss and movsd with integer vector code on a regular basis, a very bad mixture. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228135 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 10:58:53 +00:00
Chandler Carruth	886bbe2d76	[x86] Remove a low-value test that was just checking how we cleared a register. We have lots of tests covering this. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228133 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 10:47:34 +00:00
Chandler Carruth	424a198c30	[x86] Mechanically update a bunch of tests' check lines using the latest version of the script. Changes include: - Using the VEX prefix - Skipping more detail when we have useful shuffle comments to match - Matching more shuffle comments that have been added to the printer (yay!) - Matching the destination registers of some AVX instructions - Stripping trailing whitespace that crept in - Fixing indentation issues Nothing interesting going on here. I'm just trying really hard to ensure these changes don't show up in the diffs with actual changes to the backend. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228132 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 10:46:53 +00:00
Renato Golin	ff01f89466	Reverting VLD1/VST1 base-updating/post-incrementing combining This reverts patches 223862, 224198, 224203, and 224754, which were all related to the vector load/store combining and were reverted/reapplied a few times due to the same alignment problems we're seeing now. Further tests, mainly self-hosting Clang, will be needed to reapply this patch in the future. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228129 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 10:11:59 +00:00
Chandler Carruth	d16b9cd3d4	[x86] Include the destination register in the check-lines for AVX instructions. No actual change here. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228127 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 09:18:27 +00:00
Chandler Carruth	82b686e611	[x86] Add some tests I missed in the prior commit to cover blends with zero for v8i16 as well. These exhibit the same domain badness, but also exhibit other weaknesses in our blend lowering. More fixes to come. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228126 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 09:15:46 +00:00
Chandler Carruth	da681cc578	[x86] Start to introduce bit-masking based blend lowering. This is the simplest form of bit-math based blending which only fires when we are blending with zero and is relatively profitable. I've only enabled this path on very specific lowering strategies. I'm planning to widen its applicability in subsequent patches, but so far you'll notice that even though we get fewer shufps instructions, we still do the bit math in the FP execution port. I'm looking into why this is still happening. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228124 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 09:06:05 +00:00
Chandler Carruth	6b1eacb0b5	[x86] Add tests for blends-with-zero on 4-element vectors. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228122 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 09:05:58 +00:00
Bill Schmidt	89e8a17b4d	[PowerPC] Handle 32-bit targets properly in PPCTLSDynamicCall.cpp git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228116 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 05:51:56 +00:00
Chandler Carruth	786f55c1fb	[x86] Refresh the checks of a number of tests using update_llc_test_checks.py. The exact format of the checks has changed over time. This includes different indenting rules, new shuffle comments that have been added, and more operand hiding behind regular expressions. No functional change to the tests is expected here, but this will make subsequent patches have a clean diff as they change shuffle lowering. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228097 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 00:58:42 +00:00
Chandler Carruth	18ee73e456	[x86] Switch to using the long '--check-prefix' form which the update_llc_test_checks.py script uses, and refresh the checks in this test. No functionality changed here, just bringing this test up to work with automated updates using the python script. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228096 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 00:58:40 +00:00
Chandler Carruth	877ac0a034	[x86] Port this test to use utils/update_llc_test_checks.py. This will make it easy to update as I change some parts of the X86 backend, makes it more clear what instruction differences are introduced, and I find it makes it a bit easier to read as well. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228095 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 00:58:37 +00:00
Sanjay Patel	f1ac92a3b9	improved CHECK git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228086 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-04 00:24:06 +00:00
Simon Pilgrim	3d04e48cb6	[X86][SSE] psrl(w/d/q) and psll(w/d/q) bit shifts for SSE2 Patch to match cases where shuffle masks can be reduced to bit shifts. Similar to byte shift shuffle matching from D5699. Differential Revision: http://reviews.llvm.org/D6649 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228047 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 21:58:29 +00:00
Chandler Carruth	2e3524ec17	[x86] Add two truly horrific test cases for the new vector shuffle lowering. I'm prepping patches to improve these, and this will let the delta of those patches show the improvement. =] git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228044 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 21:56:28 +00:00
Chandler Carruth	d5a61c2958	[x86] Update the indent and layout of some tests in this file. NFC This is just to remove noise from using the update_llc_test_checks script. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228043 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 21:56:24 +00:00
Marek Olsak	90eef42c8e	R600/SI: Remove the -CHECK suffix from all FileCheck prefixes in LIT tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228040 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 21:53:27 +00:00
Marek Olsak	e1a8ca95be	R600/SI: Fix B64 VALU shifts on VI SI only has standard versions. VI only has REV versions. Tested-by: Michel Dänzer <michel.daenzer@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228037 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 21:53:01 +00:00
Chandler Carruth	dc5e49a1c4	[x86] Tweak my update script to use test case function names starting with 'stress' to indicate that the specific output isn't interesting and relax them to only check the last instruction (a ret). I've updated the one test case that really uses this to name the one 'stress_test' which was actually producing output we can directly check. With this, the script doesn't introduce noise when run over the v16 test file. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228033 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 21:26:45 +00:00
Colin LeMahieu	3c159ed1a0	[Hexagon] Converting XTYPE/SHIFT intrinsics. Cleaning out old intrinsic patterns and updating tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228026 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 20:40:52 +00:00
Simon Pilgrim	646722d55f	[X86][SSE] Added general integer shuffle matching for MOVQ instruction This patch adds general shuffle pattern matching for the MOVQ zero-extend instruction (copy lower 64bits, zero upper) for all 128-bit integer vectors, it is added as a fallback test in lowerVectorShuffleAsZeroOrAnyExtend. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228022 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 20:09:18 +00:00
Colin LeMahieu	861e105e61	[Hexagon] Updating XTYPE/PRED intrinsics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228019 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 19:43:59 +00:00
Colin LeMahieu	30f48c7dc4	[Hexagon] Updating XTYPE/PERM intrinsics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228015 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 19:36:59 +00:00
Simon Pilgrim	71a4e9522e	[X86][AVX2] Enabled shuffle matching for the AVX2 zero extension (128bit -> 256bit) vpmovzx* instructions. Differential Revision: http://reviews.llvm.org/D7251 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228014 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 19:34:09 +00:00
Rafael Espindola	f4e2998eda	Fix typo in test/CodeGen/X86/sibcall.ll (pr22331). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228011 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 19:20:26 +00:00
Colin LeMahieu	6217146dce	[Hexagon] Adding missing vector multiply instruction encodings. Converting multiply intrinsics and updating tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228010 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 19:15:11 +00:00
Sanjay Patel	9b4cc76745	Merge consecutive 16-byte loads into one 32-byte load (PR22329) This patch detects consecutive vector loads using the existing EltsFromConsecutiveLoads() logic. This fixes: http://llvm.org/bugs/show_bug.cgi?id=22329 This patch effectively reverts the tablegen additions of D6492 / http://reviews.llvm.org/rL224344 ...which in hindsight were a horrible hack. The test cases that were added with that patch are simply modified to load from varying offsets of a base pointer. These loads did not match the existing tablegen patterns. A happy side effect of doing this optimization earlier is that we can now fold the load into a math op where possible; this is shown in some of the updated checks in the test file. Differential Revision: http://reviews.llvm.org/D7303 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228006 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 18:54:00 +00:00
Colin LeMahieu	936986d12d	[Hexagon] Converting complex number intrinsics and adding tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227995 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 18:16:28 +00:00
Colin LeMahieu	a3a588d983	[Hexagon] Adding vector intrinsics for alu32/alu and xtype/alu. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227993 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 18:01:45 +00:00
Marek Olsak	a95296a86e	R600/SI: Don't generate non-existent LSHL, LSHR, ASHR B32 variants on VI This can happen when a REV instruction is commuted. The trick is not to define the _vi versions of instructions, which has these consequences: - code generation will always fail if a pseudo cannot be lowered (very useful to catch bugs where an unsupported instruction somehow makes it to the printer) - ability to query if a pseudo can be lowered, which is done in commuteOpcode to prevent REV from commuting to non-REV on VI Tested-by: Michel Dänzer <michel.daenzer@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227990 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 17:38:12 +00:00
Marek Olsak	b19dbd9eb3	R600/SI: Fix dependency between instruction writing M0 and S_SENDMSG on VI (v2) This fixes a hang when using an empty geometry shader. v2: - don't add s_nop when followed by s_waitcnt - cosmetic changes Tested-by: Michel Dänzer <michel.daenzer@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227986 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 17:37:52 +00:00
Sanjay Patel	3cf9267d4e	Fix program crashes due to alignment exceptions generated for SSE memop instructions (PR22371). r224330 introduced a bug by misinterpreting the "FeatureVectorUAMem" bit. The commit log says that change did not affect anything, but that's not correct. That change allowed SSE instructions to have unaligned mem operands folded into math ops, and that's not allowed in the default specification for any SSE variant. The bug is exposed when compiling for an AVX-capable CPU that had this feature flag but without enabling AVX codegen. Another mistake in r224330 was not adding the feature flag to all AVX CPUs; the AMD chips were excluded. This is part of the fix for PR22371 ( http://llvm.org/bugs/show_bug.cgi?id=22371 ). This feature bit is SSE-specific, so I've renamed it to "FeatureSSEUnalignedMem". Changed the existing test case for the feature bit to reflect the new name and renamed the test file itself to better reflect the feature. Added runs to fold-vex.ll to check for the failing codegen. Note that the feature bit is not set by default on any CPU because it may require a configuration register setting to enable the enhanced unaligned behavior. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227983 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 17:13:04 +00:00
Bill Schmidt	aeba87d6a6	Disable 32-bit tests in tls-pic.ll until they can be repaired git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227981 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 16:57:38 +00:00
Bill Schmidt	b32d6f455f	Further revise too-restrictive test CodeGen/PowerPC/tls-pic.ll git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227980 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 16:33:55 +00:00
Bill Schmidt	f336df5f3f	Further revise too-restrictive test CodeGen/PowerPC/tls-pic.ll git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227978 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 16:29:52 +00:00
Bill Schmidt	5114e12df0	Revise too-restrictive test CodeGen/PowerPC/tls-pic.ll git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227977 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 16:24:05 +00:00
Bill Schmidt	1123a81009	[PowerPC] Yet another approach to __tls_get_addr This patch is a third attempt to properly handle the local-dynamic and global-dynamic TLS models. In my original implementation, calls to __tls_get_addr were hidden from view until the asm-printer phase, at which point the underlying branch-and-link instruction was created with proper relocations. This mostly worked well, but I used some repellent techniques to ensure that the TLS_GET_ADDR nodes at the SD and MI levels correctly received input from GPR3 and produced output into GPR3. This proved to work badly in the presence of multiple TLS variable accesses, with the copies to and from GPR3 being scheduled incorrectly and generally creating havoc. In r221703, I addressed that problem by representing the calls to __tls_get_addr as true calls during instruction lowering. This had the advantage of removing all of the bad hacks and relying on the existing call machinery to properly glue the copies in place. It looked like this was going to be the right way to go. However, as a side effect of the recent discovery of problems with linker optimizations for TLS, we discovered cases of suboptimal code generation with this strategy. The problem comes when tls_get_addr is called for the same address, and there is a resulting CSE opportunity. It turns out that in such cases MachineCSE will common the addis/addi instructions that set up the input value to tls_get_addr, but will not common the calls themselves. MachineCSE does not have any machinery to common idempotent calls. This is perfectly sensible, since presumably this would be done at the IR level, and introducing calls in the back end isn't commonplace. In any case, we end up with two calls to __tls_get_addr when one would suffice, and that isn't good. 
I presumed that the original design would have allowed commoning of the machine-specific nodes that hid the __tls_get_addr calls, so as suggested by Ulrich Weigand, I went back to that design and cleaned it up so that the copies were properly held together by glue nodes. However, it turned out that this didn't work either...the presence of copies to physical registers kept the machine-specific nodes from being commoned also. All of which leads to the design presented here. This is a return to the original design, except that no attempt is made to introduce copies to and from GPR3 during instruction lowering. Virtual registers are used until prior to register allocation. At that point, a special pass is run that identifies the machine-specific nodes that hide the tls_get_addr calls and introduces the copies to and from GPR3 around them. The register allocator then coalesces these copies away. With this design, MachineCSE succeeds in commoning tls_get_addr calls where possible, and we get nice optimal code generation (better than GCC at the moment, which does not common these calls). One additional problem must be dealt with: After introducing the mentions of the physical register GPR3, the aggressive anti-dependence breaker sees opportunities to improve scheduling by selecting a different register instead. Flags must be used on the instruction descriptions to tell the anti-dependence breaker to keep its hands in its pockets. One thing missing from the original design was recording a definition of the link register on the GET_TLS_ADDR nodes. Doing this was found to be insufficient to force a stack frame to be created, which led to looping behavior because two different LR values were stored at the same address. This appears to have been an oversight in PPCFrameLowering::determineFrameLayout(), which is repaired here. 
Because MustSaveLR() returns true for calls to builtin_return_address, this changed the expected behavior of test/CodeGen/PowerPC/retaddr2.ll, which now stacks a frame but formerly did not. I've fixed the test case to reflect this. There are existing TLS tests to catch regressions; the checks in test/CodeGen/PowerPC/tls-store2.ll proved to be too restrictive in the face of instruction scheduling with these changes, so I fixed that up. I've added a new test case based on the PrettyStackTrace module that demonstrated the original problem. This checks that we get correct code generation and that CSE of the calls to __tls_get_addr has taken place. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227976 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 16:16:01 +00:00
Sanjay Patel	ec60318bf5	Improve test to actually check for a folded load. This test was checking for lack of a "movaps" (an aligned load) rather than a "movups" (an unaligned load). It also included a store which complicated the checking. Add specific CPU runs to prevent subtarget feature flag overrides from inhibiting this optimization. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227972 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 15:37:18 +00:00
Bruno Cardoso Lopes	7df357f552	[X86][MMX] Improve transfer from mmx to i32 Improve EXTRACT_VECTOR_ELT DAG combine to catch conversion patterns between x86mmx and i32 with more layers of indirection. Before: movq2dq %mm0, %xmm0 movd %xmm0, %eax After: movd %mm0, %eax git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227969 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-03 14:46:49 +00:00
Alex Rosenberg	cba5c599e8	Revert part of r227437 as it was unnecessary. Thanks to echristo for pointing this out. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227897 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-02 23:58:54 +00:00
Bruno Cardoso Lopes	d821e0a5cc	[X86][MMX] Add tests for MMX extract element LLVM ToT produces poor MMX code compared to 3.5. However, part of the previous functionality can be achieved by using -x86-experimental-vector-widening-legalization. Add tests to be sure we don't regress again. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227869 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-02 22:00:48 +00:00
Bruno Cardoso Lopes	12c944ba10	[X86][MMX] Cleanup shuffle, bitcast and insert element tests - Merge MMX arg passing test files - Merge MMX bitcast, insert elt and shuffle tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227867 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-02 21:56:11 +00:00
Tom Stellard	d73d1062fe	R600/SI: 64-bit and larger memory access must be at least 4-byte aligned This is true for SI only. CI+ supports unaligned memory accesses, but this requires driver support, so for now we disallow unaligned accesses for all GCN targets. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227822 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-02 18:02:28 +00:00
Tom Stellard	80e70ee18e	R600/SI: Merge two test files git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227821 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-02 18:02:23 +00:00
Ahmed Bougacha	0f1a21bcb8	[AArch64] Prefer DUP/MOV ("CPY") to INS for vector_extract. This avoids a partial false dependency on the previous content of the upper lanes of the destination vector register. Differential Revision: http://reviews.llvm.org/D7307 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227820 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-02 17:55:57 +00:00
Sanjay Patel	f766946abd	fix typo git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227815 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-02 17:47:30 +00:00
Jan Wen Voung	1a63641597	Fix ARM peephole optimizeCompare to avoid optimizing unsigned cmp to 0. Summary: Previously it only avoided optimizing signed comparisons to 0. Sometimes the DAGCombiner will optimize the unsigned comparisons to 0 before it gets to the peephole pass, but sometimes it doesn't. Fix for PR22373. Test Plan: test/CodeGen/ARM/sub-cmp-peephole.ll Reviewers: jfb, manmanren Subscribers: aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D7274 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227809 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-02 16:56:50 +00:00
Hal Finkel	3bafb64914	[PowerPC] VSX stores don't also read The VSX store instructions were also picking up an implicit "may read" from the default pattern, which was an intrinsic (and we don't currently have a way of specifying write-only intrinsics). This was causing MI verification to fail for VSX spill restores. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227759 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-01 19:07:41 +00:00
Hal Finkel	ec716cecda	[PowerPC] Better scheduling for isel on P7/P8 isel is actually a cracked instruction on the P7/P8, and must start a dispatch group. The scheduling model should reflect this so that we don't bunch too many of them together when possible. Thanks to Bill Schmidt and Pat Haugen for helping to sort this out. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227758 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-01 17:52:16 +00:00
Michael Kuperstein	acd5f13c88	[X86] Convert esp-relative movs of function arguments to pushes, step 2 This moves the transformation introduced in r223757 into a separate MI pass. This allows it to cover many more cases (not only cases where there must be a reserved call frame), and perform rudimentary call folding. It still doesn't have a heuristic, so it is enabled only for optsize/minsize, with stack alignment <= 8, where it ought to be a fairly clear win. (Re-commit of r227728) Differential Revision: http://reviews.llvm.org/D6789 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227752 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-01 16:56:04 +00:00
Michael Kuperstein	5b61b8f53c	Revert r227728 due to bad line endings. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227746 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-01 16:15:07 +00:00
Hal Finkel	8f5c829c1e	[PowerPC] Make r2 allocatable on PPC64/ELF for some leaf functions The TOC base pointer is passed in r2, and we normally reserve this register so that we can depend on it being there. However, for leaf functions, and specifically those leaf functions that don't do any TOC access of their own (which is generally due to accessing the constant pool, using TLS, etc.), we can treat r2 as an ordinary callee-saved register (it must be callee-saved because, for local direct calls, the linker will not insert any save/restore code). The allocation order has been changed slightly for PPC64/ELF systems to put r2 at the end of the list (while leaving it near the beginning for Darwin systems to prevent unnecessary output changes). While r2 is allocatable, using it still requires spill/restore traffic, and thus comes at the end of the list. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227745 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-01 15:03:28 +00:00
Michael Kuperstein	59d9986259	[X86] Convert esp-relative movs of function arguments to pushes, step 2 This moves the transformation introduced in r223757 into a separate MI pass. This allows it to cover many more cases (not only cases where there must be a reserved call frame), and perform rudimentary call folding. It still doesn't have a heuristic, so it is enabled only for optsize/minsize, with stack alignment <= 8, where it ought to be a fairly clear win. Differential Revision: http://reviews.llvm.org/D6789 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227728 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-01 11:44:44 +00:00
Elena Demikhovsky	516052acd3	AVX2: Added 2 more tests for gather intrinsics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227718 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-01 08:52:15 +00:00
Jingyue Wu	f15b696b79	[NVPTX] Emit .pragma "nounroll" for loops marked with nounroll Summary: CUDA driver can unroll loops when jit-compiling PTX. To prevent the CUDA driver from unrolling a loop marked with llvm.loop.unroll.disable, we need to emit .pragma "nounroll" at the header of that loop. This patch also extracts getting unroll metadata from loop ID metadata into a shared helper function. Test Plan: test/CodeGen/NVPTX/nounroll.ll Reviewers: eliben, meheff, jholewinski Reviewed By: jholewinski Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D7041 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227703 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-01 02:27:45 +00:00
Matt Arsenault	9061eb6d2e	R600/SI: Only select cvt_flr/cvt_rpi with no NaNs. These have different behavior from cvt_i32_f32 on NaN. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227693 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-31 21:28:13 +00:00
Simon Pilgrim	982005c23e	[X86][SSE] Shuffle mask decode support for zero extend, scalar float/double moves and integer load instructions This patch adds shuffle mask decodes for integer zero extends (pmovzx** and movq xmm,xmm) and scalar float/double loads/moves (movss/movsd). Also adds shuffle mask decodes for integer loads (movd/movq). Differential Revision: http://reviews.llvm.org/D7228 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227688 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-31 14:09:36 +00:00
Saleem Abdulrasool	81674807cc	ARM: support stack probe size on Windows on ARM Now that -mstack-probe-size is piped through to the backend via the function attribute as on Windows x86, honour the value to permit handling of non-default values for stack probes. This is needed /Gs with the clang-cl driver or -mstack-probe-size with the clang driver when targeting Windows on ARM. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227667 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-31 02:26:37 +00:00
Ahmed Bougacha	cfb61b368c	[AArch64] Add a few more DUP testcases. NFC. Also, don't lie about testing index 0. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227642 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-30 23:41:15 +00:00
Ahmed Bougacha	f2045fb1f2	[AArch64] Robustize neon-scalar-copy.ll tests. NFC. Some of those didn't even have run lines: they were removed inadvertently during the Great Merge of 2014. They used to check for DUPs, but now we go through W-regs? Filed PR22418 for that potential regression. For now, just make the tests explicit, so we know where we stand. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227635 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-30 23:13:57 +00:00
Ahmed Bougacha	ee93f014cc	[X86] Cleanup tabs in test vector-zext.ll. NFC. Some tests have tabs, some don't. In vector-[sz]ext.ll, space wins (well duh!). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227615 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-30 21:41:28 +00:00
Reid Kleckner	e359929517	Win64: Put a REX_W prefix on all TAILJMP* instructions MSDN's x64 software conventions page says that this is one of the fixed list of legal epilogues: https://msdn.microsoft.com/en-us/library/tawsa7cb.aspx Presumably this is how the unwinder distinguishes epilogue jumps from in-function control flow. Also normalize the way we place "## TAILCALL" comments on such jumps. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227611 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-30 21:03:31 +00:00
Hao Liu	7d3a44a692	[AArch64]Fix PR21675, a bug about lowering llvm.ctpop.i32. We should not use "DAG.getUNDEF(MVT::v8i8)" to get all zero vector. Patch by Wei-cheng Wang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227550 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-30 02:13:53 +00:00
Reid Kleckner	c9fbc97e95	x86: Fix large model calls to __chkstk for dynamic allocas In the large code model, we now put __chkstk in %r11 before calling it. Refactor the code so that we only do this once. Simplify things by using __chkstk_ms instead of __chkstk on cygming. We already use that symbol in the prolog emission, and it simplifies our logic. Second half of PR18582. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227519 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-29 23:58:04 +00:00
Reid Kleckner	cb867e4ac4	Update comments to use unreachable instead of llvm.trap, as implemented now win64: Call __chkstk through a register with the large code model Fixes half of PR18582. True dynamic allocas will still have a CALL64pcrel32 which will fail. Reviewers: majnemer Differential Revision: http://reviews.llvm.org/D7267 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227503 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-29 22:33:00 +00:00
Matt Arsenault	fa711758df	R600/SI: Implement enableAggressiveFMAFusion Add tests for the various combines. This should always be at least cycle neutral on all subtargets for f64, and faster on some. For f32 we should prefer selecting v_mad_f32 over v_fma_f32. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227484 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-29 19:34:32 +00:00
Colin LeMahieu	dec5091220	[Hexagon] Deleting old variants of intrinsics and adding missing tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227474 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-29 17:26:56 +00:00
Colin LeMahieu	f1b4917f1b	[Hexagon] Adding CR intrinsic tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227463 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-29 16:55:37 +00:00
Tom Stellard	51a3c27d6e	R600/SI: Define a schedule model and enable the generic machine scheduler The schedule model is not complete yet, and could be improved. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227461 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-29 16:55:25 +00:00
Robert Lougher	1031549bec	[X86] Use single add/sub for large stack offsets For large stack offsets the compiler generates multiple immediate mode sub/add instructions in the prologue/epilogue. This patch makes the compiler place the final amount to be added/subtracted into a register, which is then added/subtracted with a single operation. Differential Revision: http://reviews.llvm.org/D7226 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227458 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-29 16:18:29 +00:00
Colin LeMahieu	c7260e2ffa	[Hexagon] Adding XTYPE/PRED intrinsic tests. Converting predicate types to i32 instead of i1. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227457 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-29 16:08:43 +00:00
Bill Schmidt	a5ea0b50a4	[PowerPC] Complete setting the baseline for ppc64le Patch by Nemanja Ivanovic. As was uncovered by the failing test case (when run on non-PPC platforms), the feature set when compiling with -march=ppc64le was not being picked up. This change ensures that if the -mcpu option is not specified, the correct feature set is picked up regardless of whether we are on PPC or not. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227455 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-29 15:59:09 +00:00
Alex Rosenberg	1c31f0df49	Make the test actually test what it's supposed to test. Add a test for the from memory variant of vcvtph2ps for 256-bit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227446 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-29 15:19:54 +00:00
Alex Rosenberg	27f7a0622c	Cleanup a few tests on sse4a machines and FileCheckize along the way. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227437 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-29 13:31:32 +00:00
Rafael Espindola	6cc1119408	Don't create multiple mergeable sections with -fdata-sections. ELF has support for sections that can be split into fixed size or null terminated entities. Since these sections can be split by the linker, it is not necessary to split them in codegen. This reduces the combined .o size in a llvm+clang build from 202,394,570 to 173,819,098 bytes. The time for linking clang with gold (on a VM, on a laptop) goes from 2.250089985 to 1.383001792 seconds. The flip side is the size of rodata in clang goes from 10,926,785 to 10,929,345 bytes. The increase seems to be because of http://sourceware.org/bugzilla/show_bug.cgi?id=17902. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227431 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-29 12:43:28 +00:00
Charlie Turner	1a8618cbbf	Add a missing Tag_DIV_use test for Cortex-M7. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227429 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-29 11:19:54 +00:00
Reid Kleckner	f77571aeac	Add a Windows EH preparation pass that zaps resumes If the personality is not a recognized MSVC personality function, this pass delegates to the dwarf EH preparation pass. This chaining supports people on -windows-itanium or -windows-gnu targets. Currently this recognizes some personalities used by MSVC and turns resume instructions into traps to avoid link errors. Even if cleanups are not used in the source program, LLVM requires the frontend to emit a code path that resumes unwinding after an exception. Clang does this, and we get unreachable resume instructions. PR20300 covers cleaning up these unreachable calls to resume. Reviewers: majnemer Differential Revision: http://reviews.llvm.org/D7216 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227405 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-29 00:41:44 +00:00
Colin LeMahieu	24373e35a4	[Hexagon] Updating several V5 intrinsics and adding FP tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227379 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-28 22:08:16 +00:00
Colin LeMahieu	a5062b38a9	[Hexagon] Adding XTYPE/MPY intrinsic tests and some missing multiply instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227347 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-28 19:16:17 +00:00
Colin LeMahieu	4925c39604	[Hexagon] Deleting a lot of old variants of intrinsics and updating references. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227338 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-28 18:29:11 +00:00
Colin LeMahieu	1135b0df02	[Hexagon] Converting XTYPE/BIT intrinsic patterns and adding tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227335 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-28 18:06:23 +00:00
Colin LeMahieu	a5070baf7e	[Hexagon] Replacing XTYPE/SHIFT intrinsic patterns. Adding tests and missing instructions with tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227330 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-28 17:37:59 +00:00
Colin LeMahieu	0f5f7c0c1b	[Hexagon] Replacing old intrinsic tests with organized versions that match the reference manual. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227321 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-28 16:58:05 +00:00
Tom Stellard	ff340f98e3	R600: Move DataLayout to AMDGPUTargetMachine This is a follow up to r227113. It is now required to use the amdgcn target for SI and newer GPUs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227316 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-28 16:04:26 +00:00
Michael Kuperstein	0906c8fc1c	[X86] Reduce some 32-bit imuls into lea + shl Reduce integer multiplication by a constant of the form k*2^c, where k is in {3,5,9} into a lea + shl. Previously it was only done for imulq on 64-bit platforms, but it makes sense for imull and 32-bit as well. Differential Revision: http://reviews.llvm.org/D7196 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227308 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-28 14:08:22 +00:00
Michael Kuperstein	e5b95695ea	[x32] Enable sibcall optimization on x32. This includes two things: 1) Fix TCRETURNdi and TCRETURN64di patterns to check the right thing (LP64 as opposed to target bitness). 2) Allow LEA64_32 in MatchingStackOffset. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227307 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-28 13:38:48 +00:00
Elena Demikhovsky	b9d3801cd2	AVX-512: Added FMA intrinsics with rounding mode By Asaf Badouh and Elena Demikhovsky Added special nodes for rounding: FMADD_RND, FMSUB_RND.. It will prevent merge between nodes with rounding and other standard nodes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227303 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-28 10:21:27 +00:00
Quentin Colombet	24508d33fb	Revert r227242 - Merge vector stores into wider vector stores (PR21711). This commit creates an infinite loop in DAG combine in the LLVM test-suite for aarch64 with mcpu=cyclone (just having neon may be enough to expose this). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227272 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-27 23:58:01 +00:00
Alexey Samsonov	00b7a940e7	Revert "[x86] Combine x86mmx/i64 to v2i64 conversion to use scalar_to_vector" This reverts commits r226953 and r226974. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227248 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-27 21:34:11 +00:00
Sanjay Patel	c94f9d3d2f	Merge vector stores into wider vector stores (PR21711) This patch resolves part of PR21711 ( http://llvm.org/bugs/show_bug.cgi?id=21711 ). The 'f3' test case in that report presents a situation where we have two 128-bit stores extracted from a 256-bit source vector. Instead of producing this: vmovaps %xmm0, (%rdi) vextractf128 $1, %ymm0, 16(%rdi) This patch merges the 128-bit stores into a single 256-bit store: vmovups %ymm0, (%rdi) Differential Revision: http://reviews.llvm.org/D7208 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227242 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-27 20:50:27 +00:00
Ramkumar Ramachandra	e67f5de1f7	overloaded-intrinsic-name: exercise anyptr on struct No other test I know shows how struct names are mangled in overloaded intrinsic functions. Differential Revision: http://reviews.llvm.org/D7037 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227229 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-27 20:03:08 +00:00
Marek Olsak	fd55bcd060	R600/SI: Enable all tests that pass on VI without changes git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227214 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-27 17:27:15 +00:00
Simon Pilgrim	44513da617	[X86][SSE] Float comparisons can sometimes be safely commuted For ordered, unordered, equal and not-equal tests, packed float and double comparison instructions can be safely commuted without affecting the results. This patch checks the comparison mode of the (v)cmpps + (v)cmppd instructions and commutes the result if it can. Differential Revision: http://reviews.llvm.org/D7178 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227145 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-26 22:29:24 +00:00
Simon Pilgrim	3ba85ab23a	[X86][PCLMUL] Enable commutation for PCLMUL instructions Patch to allow (v)pclmulqdq to be commuted - swaps the src registers and inverts the immediate (low/high) src mask. Differential Revision: http://reviews.llvm.org/D7180 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227141 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-26 22:00:18 +00:00
Simon Pilgrim	38c35f3e2c	Line endings fix. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227138 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-26 21:28:32 +00:00
Matt Arsenault	b33118d503	R600: Cleanup or test Fix broken check lines, use multiple check prefixes, add an additional test for i1 or. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227137 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-26 21:16:10 +00:00
Simon Pilgrim	1d34ec14e9	Line endings fix. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227136 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-26 21:15:42 +00:00
Bruno Cardoso Lopes	0f3d4650f7	[x86][MMX] Rename and cleanup tests: arith, intrinsics and shuffle - Rename mmx-builtins to mmx-intrinsics to match other intrinsic test naming. - Remove tests that duplicate functionality from mmx-intrinsics.ll. - Move arith related tests to mmx-arith.ll. - MMX related shuffle goes to vector-shuffle-mmx.ll. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227130 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-26 20:06:51 +00:00
Justin Holewinski	33eac3ee53	[NVPTX] Generate a more optimal sequence for select of i1 Instead of creating a pattern like "(p && a) || ((!p) && b)", just expand the i8 operands to i32 and perform the selp on them. Fixes PR22246 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227123 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-26 19:52:20 +00:00
Justin Holewinski	48e872230d	[NVPTX] Handle floating-point conversion patterns that are not explicitly ordered or unordered Fixes PR22322 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227117 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-26 19:11:20 +00:00
Alex Rosenberg	8da9a6686a	Use a different encoding for debugtrap on PS4. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227116 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-26 19:09:27 +00:00
Sanjay Patel	956d6f0cf5	Model sqrtsd as a binary operation with one source operand tied to the destination (PR14221) This patch fixes the following miscompile: define void @sqrtsd(<2 x double> %a) nounwind uwtable ssp { %0 = tail call <2 x double> @llvm.x86.sse2.sqrt.sd(<2 x double> %a) nounwind %a0 = extractelement <2 x double> %0, i32 0 %conv = fptrunc double %a0 to float %a1 = extractelement <2 x double> %0, i32 1 %conv3 = fptrunc double %a1 to float tail call void @callee2(float %conv, float %conv3) nounwind ret void } Current codegen: sqrtsd %xmm0, %xmm1 ## high element of %xmm1 is undef here xorps %xmm0, %xmm0 cvtsd2ss %xmm1, %xmm0 shufpd $1, %xmm1, %xmm1 cvtsd2ss %xmm1, %xmm1 ## operating on undef value jmp _callee This is a continuation of http://llvm.org/viewvc/llvm-project?view=revision&revision=224624 ( http://reviews.llvm.org/D6330 ) which was itself a continuation of r167064 ( http://llvm.org/viewvc/llvm-project?view=revision&revision=167064 ). All of these patches are partial fixes for PR14221 ( http://llvm.org/bugs/show_bug.cgi?id=14221 ); this should be the final patch needed to resolve that bug. Differential Revision: http://reviews.llvm.org/D6885 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227111 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-26 18:42:16 +00:00
Eric Christopher	fcd3c4065d	Move the Mips target to storing the ABI in the TargetMachine rather than on MipsSubtargetInfo. This required a bit of massaging in the MC level to handle this since MC is a) largely a collection of disparate classes with no hierarchy, and b) there's no overarching equivalent to the TargetMachine, instead only the subtarget via MCSubtargetInfo (which is the base class of TargetSubtargetInfo). We're now storing the ABI in both the TargetMachine level and in the MC level because the AsmParser and the TargetStreamer both need to know what ABI we have to parse assembly and emit objects. The target streamer has a pointer to the one in the asm parser and is updated when the asm parser is created. This is fragile as the FIXME comment notes, but shouldn't be a problem in practice since we always create an asm parser before attempting to emit object code via the assembler. The TargetMachine now contains the ABI so that the DataLayout can be constructed dependent upon ABI. All testcases have been updated to use the -target-abi command line flag so that we can set the ABI without using a subtarget feature. Should be no change visible externally here. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227102 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-26 17:33:46 +00:00
Sanjay Patel	dbc6dda771	fix line-endings; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227095 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-26 17:21:36 +00:00
Vasileios Kalintiris	536bce219d	[mips] Enable arithmetic and binary operations for the i128 data type. Summary: This patch adds support for some operations that were missing from 128-bit integer types (add/sub/mul/sdiv/udiv... etc.). With these changes we can support the __int128_t and __uint128_t data types from C/C++. Depends on D7125 Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7143 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227089 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-26 12:33:22 +00:00
Vasileios Kalintiris	71ec66e7fd	[mips] Add tests for bitwise binary and integer arithmetic operators. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D7125 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227087 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-26 12:04:40 +00:00
Vasileios Kalintiris	823e8548a0	Revert "[mips] Fix assertion on i128 addition/subtraction on MIPS64" This reverts commit r227003. Support for addition/subtraction and various other operations for the i128 data type will be added in a future commit based on the review D7143. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227082 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-26 09:53:30 +00:00
Craig Topper	1656cc8ddb	[X86] Change comparison immediate type to i8 in test cases for AVX512 floating point comparisons. The type was already changed in the definitions and was being auto upgraded to the new type. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227064 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-25 23:26:12 +00:00
Craig Topper	fd176682b9	[X86] Use i8 immediate for comparison type on AVX512 packed integer instructions. This matches floating point equivalents. Includes autoupgrade support to convert old code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227063 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-25 23:26:02 +00:00
Bill Schmidt	9dbb6a4f63	[PowerPC] Revert ppc64le-aggregates.ll test changes from r227053 It appears we have different behavior with and without -mcpu=pwr8 even with ppc64le defaulting to POWER8. The failure appears as follows: /home/bb/cmake-llvm-x86_64-linux/llvm-project/llvm/test/CodeGen/PowerPC/ppc64le-aggregates.ll:268:14: error: expected string not found in input ; CHECK-DAG: lfs 1, 0([[REG]]) ^ <stdin>:497:11: note: scanning from here ld 3, .LC1@toc@l(3) ^ <stdin>:497:11: note: with variable "REG" equal to "3" ld 3, .LC1@toc@l(3) ^ <stdin>:514:2: note: possible intended match here lfs 1, 0(4) ^ Reverting this particular test case change. Nemanja, please have a look at the reason for the failure. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227055 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-25 18:18:54 +00:00
Bill Schmidt	c536eed4d8	[PowerPC] Reset the baseline for ppc64le to be equivalent to pwr8 Test by Nemanja Ivanovic. Since ppc64le implies POWER8 as a minimum, it makes sense that the same features are included. Since the pwr8 processor model will likely be getting new features until the implementation is complete, I created a new list to add these updates to. This will include them in both pwr8 and ppc64le. Furthermore, it seems that it would make sense to compose the feature lists for other processor models (pwr3 and up). Per discussion in the review, I will make this change in a subsequent patch. In order to test the changes, I've added an additional run step to test cases that specify -march=ppc64le -mcpu=pwr8 to omit the -mcpu option. Since the feature lists are the same, the behaviour should be unchanged. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227053 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-25 18:05:42 +00:00
Elena Demikhovsky	717d41d8c3	AVX-512: Changes in operations on masks registers for KNL and SKX - Added KSHIFTB/D/Q for skx - Added KORTESTB/D/Q for skx - Fixed store operation for v8i1 type for KNL - Store size of v8i1, v4i1 and v2i1 are changed to 8 bits git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227043 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-25 12:47:15 +00:00
Alexei Starovoitov	fb490166f4	bpf: add missing lit.local.cfg git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227009 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-24 18:20:52 +00:00
Alexei Starovoitov	4fe85c7548	BPF backend Summary: V8->V9: - cleanup tests V7->V8: - addressed feedback from David: - switched to range-based 'for' loops - fixed formatting of tests V6->V7: - rebased and adjusted AsmPrinter args - CamelCased .td, fixed formatting, cleaned up names, removed unused patterns - diffstat: 3 files changed, 203 insertions(+), 227 deletions(-) V5->V6: - addressed feedback from Chandler: - reinstated full verbose standard banner in all files - fixed variables that were not in CamelCase - fixed names of #ifdef in header files - removed redundant braces in if/else chains with single statements - fixed comments - removed trailing empty line - dropped debug annotations from tests - diffstat of these changes: 46 files changed, 456 insertions(+), 469 deletions(-) V4->V5: - fix setLoadExtAction() interface - clang-formated all where it made sense V3->V4: - added CODE_OWNERS entry for BPF backend V2->V3: - fix metadata in tests V1->V2: - addressed feedback from Tom and Matt - removed top level change to configure (now everything via 'experimental-backend') - reworked error reporting via DiagnosticInfo (similar to R600) - added few more tests - added cmake build - added Triple::bpf - tested on linux and darwin V1 cover letter: --------------------- recently linux gained "universal in-kernel virtual machine" which is called eBPF or extended BPF. The name comes from "Berkeley Packet Filter", since new instruction set is based on it. This patch adds a new backend that emits extended BPF instruction set. The concept and development are covered by the following articles: http://lwn.net/Articles/599755/ http://lwn.net/Articles/575531/ http://lwn.net/Articles/603983/ http://lwn.net/Articles/606089/ http://lwn.net/Articles/612878/ One of use cases: dtrace/systemtap alternative. 
bpf syscall manpage: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=b4fc1a460f3017e958e6a8ea560ea0afd91bf6fe instruction set description and differences vs classic BPF: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/networking/filter.txt Short summary of instruction set: - 64-bit registers R0 - return value from in-kernel function, and exit value for BPF program R1 - R5 - arguments from BPF program to in-kernel function R6 - R9 - callee saved registers that in-kernel function will preserve R10 - read-only frame pointer to access stack - two-operand instructions like +, -, *, mov, load/store - implicit prologue/epilogue (invisible stack pointer) - no floating point, no simd Short history of extended BPF in kernel: interpreter in 3.15, x64 JIT in 3.16, arm64 JIT, verifier, bpf syscall in 3.18, more to come in the future. It's a very small and simple backend. There is no support for global variables, arbitrary function calls, floating point, varargs, exceptions, indirect jumps, arbitrary pointer arithmetic, alloca, etc. From C front-end point of view it's very restricted. It's done on purpose, since kernel rejects all programs that it cannot prove safe. It rejects programs with loops and with memory accesses via arbitrary pointers. When kernel accepts the program it is guaranteed that program will terminate and will not crash the kernel. This patch implements all 'must have' bits. There are several things on TODO list, so this is not the end of development. Most of the code is boilerplate code, copy-pasted from other backends. Only odd things are lack of < and <= instructions, specialized load_byte intrinsics and 'compare and goto' as single instruction. Current instruction set is fixed, but more instructions can be added in the future. 
Signed-off-by: Alexei Starovoitov <alexei.starovoitov@gmail.com> Subscribers: majnemer, chandlerc, echristo, joerg, pete, rengolin, kristof.beyls, arsenm, t.p.northover, tstellarAMD, aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D6494 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227008 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-24 17:51:26 +00:00
Daniel Sanders	f945a40f9d	[mips] Fix assertion on i128 addition/subtraction on MIPS64 Summary: In addition to the included tests, this fixes test/CodeGen/Generic/i128-addsub.ll on a mips64 host. Reviewers: atanasyan, sagar, vmedic Reviewed By: vmedic Subscribers: sdkie, llvm-commits Differential Revision: http://reviews.llvm.org/D6610 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227003 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-24 12:58:10 +00:00
Andrea Di Biagio	b981dc4a21	[DAG] Fix wrong canonicalization performed on shuffle nodes. This fixes a regression introduced by r226816. When replacing a splat shuffle node with a constant build_vector, make sure that the new build_vector has a valid number of elements. Thanks to Patrik Hagglund for reporting this problem and providing a small reproducible. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227002 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-24 11:54:29 +00:00
Quentin Colombet	af1cd03764	[AArch64][LoadStoreOptimizer] Form LDPSW when possible. This patch adds the missing LD[U]RSW variants to the load store optimizer, so that we generate LDPSW when possible. <rdar://problem/19583480> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226978 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-24 01:25:54 +00:00
Tom Stellard	5b37a2e5ff	R600/SI: Emit .hsa.version section for amdhsa OS git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226970 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-23 23:59:08 +00:00
Reid Kleckner	339591e0a9	Fix assertion when C++ EH filters are present in functions using SEH Should fix PR22305. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226969 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-23 23:51:25 +00:00
Bruno Cardoso Lopes	807360ab08	[x86] Combine x86mmx/i64 to v2i64 conversion to use scalar_to_vector Handle the poor codegen for i64/x86mmx->v2i64 (%mm -> %xmm) moves. Instead of using stack store/load pair to do the job, use scalar_to_vector directly, which in the MMX case can use movq2dq. This was the current behavior prior to improvements for vector legalization of extloads in r213897. This commit fixes the regression and as a side-effect also removes some unnecessary shuffles. In the new attached testcase, we go from: pshufw $-18, (%rdi), %mm0 movq %mm0, -8(%rsp) movq -8(%rsp), %xmm0 pshufd $-44, %xmm0, %xmm0 movd %xmm0, %eax ... To: pshufw $-18, (%rdi), %mm0 movq2dq %mm0, %xmm0 movd %xmm0, %eax ... Differential Revision: http://reviews.llvm.org/D7126 rdar://problem/19413324 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226953 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-23 22:44:16 +00:00
Tom Stellard	511a3c71fc	R600/SI: Move i64 -> v2i32 load promotion into AMDGPUDAGToDAGISel::Select() We used to do this promotion during DAG legalization, but this caused an infinite loop in ExpandUnalignedLoad() because it assumed that i64 loads were legal if i64 was a legal type. It also seems better to report i64 loads as legal, since they actually are and we were just promoting them to simplify our tablegen files. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226945 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-23 22:05:45 +00:00
Reid Kleckner	26ba4c13a7	Classify functions by EH personality type rather than using the triple This mostly reverts commit r222062 and replaces it with a new enum. At some point this enum will grow at least for other MSVC EH personalities. Also beefs up the way we were sniffing the personality function. Previously we would emit the Itanium LSDA despite using __C_specific_handler. Reviewers: majnemer Differential Revision: http://reviews.llvm.org/D6987 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226920 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-23 18:49:01 +00:00
Jyoti Allur	245caec9b3	This patch fixes issue with lowering below mentioned pattern :- _foo: smull r0, r1, r1, r0 smull r2, r3, r3, r2 adds r0, r2, r0 adc r1, r3, r1 bx lr to _foo: smull r0, r1, r1, r0 smlal r0, r1, r3, r2 bx lr git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226904 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-23 09:10:03 +00:00
Craig Topper	d05a6aa4e6	[x86] Change u8imm operands to always print as unsigned. This makes shuffle masks and the like make way more sense. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226902 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-23 08:00:59 +00:00
Jan Vesely	1d07592ec7	R600: Try to use lower types for 64bit division if possible v2: add and enable tests for SI Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Matt Arsenault <Matthew.Arsenault@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226881 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-22 23:42:43 +00:00
Simon Pilgrim	316b43f7df	[X86][AVX] Added (V)MOVDDUP / (V)MOVSLDUP / (V)MOVSHDUP memory folding + tests. Minor tweak now that D7042 is complete, we can enable stack folding for (V)MOVDDUP and do proper testing. Added missing AVX ymm folding patterns and fixed alignment for AVX VMOVSLDUP / VMOVSHDUP. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226873 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-22 22:39:59 +00:00
Simon Pilgrim	6377361399	Line endings fixes. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226872 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-22 22:27:37 +00:00
Simon Pilgrim	c7d6e9b0f9	[X86][SSE] Simplified PSUBUS tests Removed loops from PSUBUS tests - ensures folding is tested. Also renamed SSE2 tests SSSE3 to match cpu. This is a follow up commit agreed in http://reviews.llvm.org/D7094 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226871 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-22 22:19:58 +00:00
Ramkumar Ramachandra	230796b278	Intrinsics: introduce llvm_any_ty aka ValueType Any Specifically, gc.result benefits from this greatly. Instead of: gc.result.int.* gc.result.float.* gc.result.ptr.* ... We now have a gc.result.* that can specialize to literally any type. Differential Revision: http://reviews.llvm.org/D7020 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226857 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-22 20:14:38 +00:00
Sanjay Patel	05d5e213c4	merge consecutive stores of extracted vector elements (PR21711) This is a 2nd try at the same optimization as http://reviews.llvm.org/D6698. That patch was checked in at r224611, but reverted at r225031 because it caused a failure outside of the regression tests. The cause of the crash was not recognizing consecutive stores that have mixed source values (loads and vector element extracts), so this patch adds a check to bail out if any store value is not coming from a vector element extract. This patch also refactors the shared logic of the constant source and vector extracted elements source cases into a helper function. Differential Revision: http://reviews.llvm.org/D6850 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226845 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-22 18:21:26 +00:00
Michael Kuperstein	a52ddfa930	[DAGCombine] Produce better code for constant splats This solves PR22276. Splats of constants would sometimes produce redundant shuffles, sometimes ridiculously so (see the PR for details). Fold these shuffles into BUILD_VECTORs early on instead. Differential Revision: http://reviews.llvm.org/D7093 Fixed recommit of r226811. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226816 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-22 13:07:28 +00:00
Michael Kuperstein	8fc1c3a619	Revert r226811, MSVC accepts code sane compilers don't. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226814 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-22 12:48:07 +00:00
Michael Kuperstein	0a979a09ae	[DAGCombine] Produce better code for constant splats This solves PR22276. Splats of constants would sometimes produce redundant shuffles, sometimes ridiculously so (see the PR for details). Fold these shuffles into BUILD_VECTORs early on instead. Differential Revision: http://reviews.llvm.org/D7093 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226811 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-22 12:37:23 +00:00
Elena Demikhovsky	2785766bc8	Fixed a bug in type legalizer for masked load/store intrinsics. The problem occurs when after vectorization we have type <2 x i32>. This type is promoted to <2 x i64> and then requires additional efforts for expanding loads and truncating stores. I added EXPAND / TRUNCATE attributes to the masked load/store SDNodes. The code now contains additional shuffles. I've prepared changes in the cost estimation for masked memory operations, it will be submitted separately. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226808 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-22 12:07:59 +00:00
Elena Demikhovsky	cdce03426d	Fixed a bug in narrowing store operation. Type MVT::i1 became legal in KNL, but store operation can't be narrowed to this type, since the size of VT (1 bit) is not equal to its actual store size (8 bits). Added a test provided by David (dag@cray.com) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226805 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-22 09:39:08 +00:00
Reid Kleckner	73f671a60e	SEH: Finish writing the catch-all test case git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226768 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-22 02:31:09 +00:00
Reid Kleckner	0d056fd4c3	Win64 SEH: Emit the constant 1 for catch-all into xdata git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226767 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-22 02:27:44 +00:00
Simon Pilgrim	3f6acdd265	[X86][SSE] Missing SSE/AVX1 memory folding integer instructions Added most of the missing integer vector folding patterns for SSE (to SSE42) and AVX1. The most useful of these are probably the i32/i64 extraction, i8/i16/i32/i64 insertions, zero/sign extension, unsigned saturation subtractions, i64 subtractions and the variable mask blends (pblendvb) - others include CLMUL, SSE42 string comparisons and bit tests. Differential Revision: http://reviews.llvm.org/D7094 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226745 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-21 23:43:30 +00:00
Tim Northover	f5f8a3e6a6	DAGCombine: fold (or (and X, M), (and X, N)) -> (and X, (or M, N)) It can help with argument juggling on some targets, and is generally a good idea. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226740 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-21 23:17:19 +00:00
Matt Arsenault	85661f76e3	R600: Add checks for urem/srem by a constant Make sure this uses the faster expansion using magic constants to avoid the full division path. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226734 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-21 22:56:15 +00:00
Simon Pilgrim	4269590166	[X86][SSE] Added support for SSE3 lane duplication shuffle instructions This patch adds shuffle matching for the SSE3 MOVDDUP, MOVSLDUP and MOVSHDUP instructions. The big use of these being that they avoid many single source shuffles from needing to use (pre-AVX) dual source instructions such as SHUFPD/SHUFPS: causing extra moves and preventing load folds. Adding these instructions uncovered an issue in XFormVExtractWithShuffleIntoLoad which crashed on single operand shuffle instructions (now fixed). It also involved fixing getTargetShuffleMask to correctly identify these instructions as unary shuffles. Also adds a missing tablegen pattern for MOVDDUP. Differential Revision: http://reviews.llvm.org/D7042 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226716 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-21 22:44:35 +00:00
Matt Arsenault	50c3bc9956	R600: Add missing tests for i64 srem git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226713 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-21 22:43:19 +00:00
Jonathan Roelofs	cab5680f6c	Fix load-store optimizer on thumbv4t Thumbv4t does not have lo->lo copies other than MOVS, and that can't be predicated. So emit MOVS when needed and bail if there's a predicate. http://reviews.llvm.org/D6592 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226711 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-21 22:39:43 +00:00
Matt Arsenault	305228cc0b	R600/SI: Custom lower fround This fixes it for SI. It also removes the pattern used previously for Evergreen for f32. I'm not sure if the new R600 output is better or not, but it uses 1 fewer instruction if BFI is available. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226682 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-21 18:18:25 +00:00
Colin LeMahieu	62b9c33e13	[Hexagon] Converting multiply and accumulate with immediate intrinsics to patterns. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226681 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-21 18:13:15 +00:00
Ahmed Bougacha	34288d885e	[X86] Declare SSE4.1/AVX2 vector extloads covered by PMOV[SZ]X legal. Now that we can fully specify extload legality, we can declare them legal for the PMOVSX/PMOVZX instructions. This for instance enables a DAGCombine to fire on code such as (and (<zextload-equivalent> ...), <redundant mask>) to turn it into: (zextload ...) as seen in the testcase changes. There is one regression, in widen_load-2.ll: we're no longer able to do store-to-load forwarding with illegal extload memory types. This will be addressed separately. Differential Revision: http://reviews.llvm.org/D6533 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226676 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-21 17:07:06 +00:00
Tim Northover	c49e57ade1	Revert "DAGCombine: fold (or (and X, M), (and X, N)) -> (and X, (or M, N))" It hadn't gone through review yet, but was still on my local copy. This reverts commit r226663 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226665 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-21 15:48:52 +00:00
Tim Northover	004d725549	AArch64: add backend option to reserve x18 (platform register) AAPCS64 says that it's up to the platform to specify whether x18 is reserved, and a first step on that way is to add a flag controlling it. From: Andrew Turner <andrew@fubar.geek.nz> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226664 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-21 15:43:31 +00:00
Tim Northover	47f47f5d2a	DAGCombine: fold (or (and X, M), (and X, N)) -> (and X, (or M, N)) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226663 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-21 15:43:28 +00:00
Michael Kuperstein	0b4244ade1	[x32] Fast ISel should use LEA64_32r instead of LEA32r to adjust addresses in x32 mode. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226661 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-21 14:44:05 +00:00
Simon Pilgrim	650d4f00ae	[X86][AVX] Simplified diff between AVX1 and SSE42 fp stack folding tests. NFC. Changed the AVX1 tests register spill tail call to return a xmm like the SSE42 version - makes doing diffs between them a lot easier without affecting the spills themselves. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226623 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-21 00:02:13 +00:00
Simon Pilgrim	8608e5bbc7	[X86][SSE] Added SSE/AVX1 integer stack folding tests. Some folding patterns + tests are missing (marked as TODO) - these will be added in a future patch for review. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226622 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-20 23:54:17 +00:00
Simon Pilgrim	32f0438ada	[X86][SSE] Added SSE fp stack folding tests. Some folding patterns + tests are missing (marked as TODO) - these will be added in a future patch for review. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226621 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-20 23:50:18 +00:00
Simon Pilgrim	bddfe2660e	[X86][AVX] Renamed AVX1 fp stack folding tests. NFC. The SSE42 version of the AVX1 float stack folding tests will be added shortly, this renames the AVX1 file so that the files will be near each other in a directory listing to help ensure they are kept in sync. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226620 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-20 23:45:50 +00:00
Colin LeMahieu	0d9733d596	[Hexagon] Adding intrinsics for doubleword ALU operations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226606 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-20 20:45:05 +00:00
Colin LeMahieu	50e4abc0ee	[Hexagon] Removing unnecessary clutter in intrinsic tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226602 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-20 19:46:07 +00:00
Daniel Jasper	529ff2f257	Prevent binary-tree deterioration in sparse switch statements. This addresses part of llvm.org/PR22262. Specifically, it prevents considering the densities of sub-ranges that have fewer than TLI.getMinimumJumpTableEntries() elements. Those densities won't help jump tables. This is not a complete solution but works around the most pressing issue. Review: http://reviews.llvm.org/D7070 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226600 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-20 19:43:33 +00:00
Ramkumar Ramachandra	b8ae0acfaf	[GC] Verify-pass void vararg functions in gc.statepoint With the appropriate Verifier changes, extracting the result out of a statepoint wrapping a vararg function crashes. However, a void vararg function works fine: commit this first step. Differential Revision: http://reviews.llvm.org/D7071 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226599 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-20 19:42:46 +00:00
Tom Stellard	5d96beaab5	R600/SI: Fix simple-loop.ll test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226596 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-20 19:33:02 +00:00
Tom Stellard	ad7a884efe	R600/SI: Add kill flag when copying scratch offset to a register This allows us to re-use the same register for the scratch offset when accessing large private arrays. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226585 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-20 17:49:45 +00:00
Tom Stellard	a978a481bb	R600/SI: Don't store scratch buffer frame index in MUBUF offset field We don't have a good way of legalizing this if the frame index offset is more than the 12-bits, which is size of MUBUF's offset field, so now we store the frame index in the vaddr field. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226584 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-20 17:49:43 +00:00
Kai Nacke	fa4d8baf54	[mips] Add registers and ALL check prefix to octeon test case. No functional change. Reviewed by D. Sanders git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226574 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-20 16:14:02 +00:00
Kai Nacke	57e80129b9	[mips] Add octeon branch instructions bbit0/bbit032/bbit1/bbit132 This commits adds the octeon branch instructions bbit0/bbit032/bbit1/bbit132. It also includes patterns for instruction selection and test cases. Reviewed by D. Sanders git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226573 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-20 16:10:51 +00:00
Simon Pilgrim	2a2f94c5f2	[X86][AVX] Missing AVX1 memory folding float instructions Now that we can create much more exhaustive X86 memory folding tests, this patch adds the missing AVX1/F16C floating point instruction stack foldings we can easily test for including the scalar intrinsics (add, div, max, min, mul, sub), conversions float/int to double, half precision conversions, rounding, dot product and bit test. The patch also adds a couple of obviously missing SSE instructions (more to follow once we have full SSE testing). Now that scalar folding is working it broke a very old test (2006-10-07-ScalarSSEMiscompile.ll) - this test appears to make no sense as its trying to ensure that a scalar subtraction isn't folded as it 'would zero the top elts of the loaded vector' - this test just appears to be wrong to me. Differential Revision: http://reviews.llvm.org/D7055 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226513 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-19 22:40:45 +00:00
Colin LeMahieu	dc8beeba1b	[Hexagon] Updating muxir/ri/ii intrinsics. Setting predicate registers as compatible with i32 rather than doing custom type conversion. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226500 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-19 20:31:18 +00:00
Colin LeMahieu	3bea6a4959	[Hexagon] Converting intrinsics combine imm/imm, simple shifts and extends. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226483 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-19 18:56:19 +00:00
Colin LeMahieu	596cfabbc4	[Hexagon] Converting remaining ALU32/ALU intrinsics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226480 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-19 18:33:58 +00:00
Colin LeMahieu	254d992ab8	[Hexagon] Converting ALU32/ALU intrinsics to new patterns. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226478 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-19 18:22:19 +00:00
Greg Fitzgerald	4b96323e81	[AArch64] Implement GHC calling convention Original patch by Luke Iannini. Minor improvements and test added by Erik de Castro Lopo. Differential Revision: http://reviews.llvm.org/D6877 From: Erik de Castro Lopo <erikd@mega-nerd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226473 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-19 17:40:05 +00:00
Colin LeMahieu	438f1e4979	[Hexagon] Converting halfword to double accumulating multiply intrinsics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226472 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-19 17:36:32 +00:00
Rafael Espindola	4b678bff4e	Bring r226038 back. No change in this commit, but clang was changed to also produce trivial comdats when needed. Original message: Don't create new comdats in CodeGen. This patch stops the implicit creation of comdats during codegen. Clang now sets the comdat explicitly when it is required. With this patch clang and gcc now produce the same result in pr19848. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226467 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-19 15:16:06 +00:00
Michael Kuperstein	5a0c8601d3	[MIScheduler] Slightly better handling of constrainLocalCopy when both source and dest are local This fixes PR21792. Differential Revision: http://reviews.llvm.org/D6823 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226433 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-19 07:30:47 +00:00
Hal Finkel	d1f1656447	[PowerPC] Add r2 as an operand for all calls under both PPC64 ELF V1 and V2 Our PPC64 ELF V2 call lowering logic added r2 as an operand to all direct call instructions in order to represent the dependency on the TOC base pointer value. Restricting this to ELF V2, however, does not seem to make sense: calls under ELF V1 have the same dependence, and indirect calls have an r2 dependence just as direct ones. Make sure the dependence is noted for all calls under both ELF V1 and ELF V2. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226432 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-19 07:20:27 +00:00
Matt Arsenault	676db0a373	R600: Remove redundant test This is already covered in ftrunc.ll git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226412 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-18 19:30:32 +00:00
Simon Pilgrim	a707cf4652	[X86][SSE] Added scalar min/max folding tests. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226406 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-18 18:06:23 +00:00
Simon Pilgrim	7c62013afe	[X86][SSE] Added float extract and xmm extract/insert stack folding tests. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226405 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-18 17:04:32 +00:00
Simon Pilgrim	2793dd7a29	[X86][SSE] Added scalar conversion stack folding tests. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226404 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-18 16:22:15 +00:00
Simon Pilgrim	65a3a9bea3	AVX1 stack folding tests. NFC. Begun adding more exhaustive tests - all floating point instructions should now be either tested or have placeholders. We do seem to have a number of missing instructions, I will add a patch for review once the remaining working instructions are added. I'll then move on to SSE tests and then the integer instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226400 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-18 12:56:39 +00:00
Hal Finkel	a01b583dbc	[PowerPC] Initial PPC64 calling-convention changes for fastcc The default calling convention specified by the PPC64 ELF (V1 and V2) ABI is designed to work with both prototyped and non-prototyped/varargs functions. As a result, GPRs and stack space are allocated for every argument, even those that are passed in floating-point or vector registers. GlobalOpt::OptimizeFunctions will transform local non-varargs functions (that do not have their address taken) to use the 'fast' calling convention. When functions are using the 'fast' calling convention, don't allocate GPRs for arguments passed in other types of registers, and don't allocate stack space for arguments passed in registers. Other changes for the fast calling convention may be added in the future. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226399 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-18 12:08:47 +00:00
Hal Finkel	962cff0c08	[PowerPC] Don't list R11 as a patchpoint scratch register R11's status is the same under both the PPC64 ELF V1 and V2 ABIs: it is reserved for use as an "environment pointer" for compilation models that require such a thing. We don't, we also don't need a second scratch register, and because we support only "local" patchpoint call targets, we might as well let R11 be used for anyregcc patchpoints. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226369 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-17 03:57:34 +00:00
Mehdi Amini	5eed637b34	Improve DAG combine pass on certain IR vector patterns Loading 2 2x32-bit float vectors into the bottom half of a 256-bit vector produced suboptimal code in AVX2 mode with certain IR combinations. In particular, the IR optimizer folded 2f32 + 2f32 -> 4f32, 4f32 + 4f32 (undef) -> 8f32 into a 2f32 + 2f32 -> 8f32, which seems more canonical, but then mysteriously generated rather bad code; the movq/movhpd combination didn't match. The problem lay in the BUILD_VECTOR optimization path. The 2f32 inputs would get promoted to 4f32 by the type legalizer, eventually resulting in a BUILD_VECTOR on two 4f32 into an 8f32. The BUILD_VECTOR then, recognizing these were both half the output size, concatted them and then produced a shuffle. However, the resulting concat + shuffle was more complex than it should be; in the case where the upper half of the output is undef, we probably want to generate shuffle + concat instead. This enhancement causes the vector_shuffle combine step to recognize this suboptimal pattern and correct it. I included it there instead of in BUILD_VECTOR in case the same suboptimal pattern occurs for other reasons. This results in the optimizer correctly producing the optimal movq + movhpd sequence for all three variations on this IR, even with AVX2. I've included a test case. Radar link: rdar://problem/19287012 Fix for PR 21943. From: Fiona Glaser <fglaser@apple.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226360 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-17 01:35:56 +00:00
Matt Arsenault	b2bb846f17	R600: Clean up floor tests These were using different naming schemes, not using multiple check prefixes and not using -LABEL. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226333 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 22:11:00 +00:00
Colin LeMahieu	d08a6cd083	[Hexagon] Converting halfword to doubleword multiply intrinsics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226326 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 21:41:57 +00:00
Colin LeMahieu	9a56f06825	[Hexagon] Converting accumulating halfword multiply intrinsics to patterns. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226324 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 21:36:34 +00:00
Colin LeMahieu	67451fe320	[Hexagon] Beginning converting intrinsics to patterns instead of duplicated definitions. Converting halfword multiply intrinsics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226318 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 20:38:54 +00:00
Adam Nemet	ad2ac976af	[AVX512] Add intrinsics for masked aligned FP loads and stores Similar to the unaligned cases. Test was generated with update_llc_test_checks.py. Part of <rdar://problem/17688758> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226296 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 18:50:09 +00:00
Adam Nemet	1310e4f3c7	[AVX512] Remove trailing whitespaces in this test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226295 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 18:50:07 +00:00
Andrea Di Biagio	ac7b9c828f	[X86][DAG] Disable target specific combine on INSERTPS dag nodes at -O0. This patch disables target specific combine on X86ISD::INSERTPS dag nodes if optlevel is CodeGenOpt::None. The backend currently implements a target specific combine rule that converts a vector load used by an INSERTPS dag node into a scalar load plus a scalar_to_vector. This allows ISel to select a single INSERTPSrm instead of two instructions (i.e. a vector load plus INSERTPSrr). However, the existing target combine rule on INSERTPS nodes only works under the assumption that ISel will always be able to match an INSERTPSrm. This is not true in general at -O0, since the backend only allows folding a load into the memory operand of an instruction if the optimization level is not CodeGenOpt::None. In the example below: // __m128 test(__m128 a, __m128 b) { __m128 c = _mm_insert_ps(a, b, 1 << 6); return c; } // Before this patch, at -O0, the backend would have canonicalized the load to 'b' into a scalar load plus scalar_to_vector. Later on, ISel would have selected an INSERTPSrr leaving the insertps mask in an inconsistent state: movss 4(%rdi), %xmm1 insertps $64, %xmm1, %xmm0 # xmm0 = xmm1[1],xmm0[1,2,3]. With this patch, the backend avoids folding the vector load into the operand of the INSERTPS. The new codegen at -O0 is: movaps (%rdi), %xmm1 insertps $64, %xmm1, %xmm0 # %xmm1[1],xmm0[1,2,3]. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226277 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 14:55:26 +00:00
Simon Pilgrim	3717f7c80c	[X86] Refactored stack memory folding tests to explicitly force register spilling The current 'big vectors' stack folded reload testing pattern is very bulky and makes it difficult to test all instructions as big vectors will tend to use only the ymm instruction implementations. This patch changes the tests to use a nop call that lists explicit xmm registers as sideeffects, with this we can force a partial register spill of the relevant registers and then check that the reload is correctly folded. The asm generated only adds the forced spill, a nop instruction and a couple of extra labels (a fraction of the current approach). More exhaustive tests will follow shortly, I've added some extra tests (the xmm versions of some of the existing folding tests) as a starting point. Differential Revision: http://reviews.llvm.org/D6932 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226264 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 09:32:54 +00:00
Timur Iskhodzhanov	6a7c74de33	Revert r226242 - Revert Revert Don't create new comdats in CodeGen This breaks AddressSanitizer (ninja check-asan) on Windows git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226251 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 08:38:45 +00:00
Hal Finkel	92cd0ca3b2	[PowerPC] Adjust PatchPoints for ppc64le Bill Schmidt pointed out that some adjustments would be needed to properly support powerpc64le (using the ELF V2 ABI). For one thing, R11 is not available as a scratch register, so we need to use R12. R12 is also available under ELF V1, so to maintain consistency, I flipped the order to make R12 the first scratch register in the array under both ABIs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226247 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 04:40:58 +00:00
Rafael Espindola	dfe88a08c7	Revert "Revert Don't create new comdats in CodeGen" This reverts commit r226173, adding r226038 back. No change in this commit, but clang was changed to also produce trivial comdats for constructors, destructors and vtables when needed. Original message: Don't create new comdats in CodeGen. This patch stops the implicit creation of comdats during codegen. Clang now sets the comdat explicitly when it is required. With this patch clang and gcc now produce the same result in pr19848. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226242 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 02:22:55 +00:00
Matt Arsenault	ab2315014e	R600/SI: Add patterns for v_cvt_{flr\|rpi}_i32_f32 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226230 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 23:58:35 +00:00
Matt Arsenault	c204f47feb	R600/SI: Fix trailing comma with modifiers Instructions with 1 operand can still use source modifiers, so make sure we don't print an extra comma afterwards. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226226 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 23:17:03 +00:00
Hal Finkel	94dc061e85	[PowerPC] Loosen ELFv1 PPC64 func descriptor loads for indirect calls Function pointers under PPC64 ELFv1 (which is used on PPC64/Linux on the POWER7, A2 and earlier cores) are really pointers to a function descriptor, a structure with three pointers: the actual pointer to the code to which to jump, the pointer to the TOC needed by the callee, and an environment pointer. We used to chain these loads, and make them opaque to the rest of the optimizer, so that they'd always occur directly before the call. This is not necessary, and in fact, highly suboptimal on embedded cores. Once the function pointer is known, the loads can be performed ahead of time; in fact, they can be hoisted out of loops. Now these function descriptors are almost always generated by the linker, and thus the contents of the descriptors are invariant. As a result, by default, we'll mark the associated loads as invariant (allowing them to be hoisted out of loops). I've added a target feature to turn this off, however, just in case someone needs that option (constructing an on-stack descriptor, casting it to a function pointer, and then calling it cannot be well-defined C/C++ code, but I can imagine some JIT-compilation system doing so). Consider this simple test: $ cat call.c typedef void (fp)(); void bar(fp x) { for (int i = 0; i < 1600000000; ++i) x(); } $ cat main.c typedef void (fp)(); void bar(fp x); void foo() {} int main() { bar(foo); } On the PPC A2 (the BG/Q supercomputer), marking the function-descriptor loads as invariant brings the execution time down to ~8 seconds from ~32 seconds with the loads in the loop. The difference on the POWER7 is smaller. Compiling with: gcc -std=c99 -O3 -mcpu=native call.c main.c : ~6 seconds [this is 4.8.2] clang -O3 -mcpu=native call.c main.c : ~5.3 seconds clang -O3 -mcpu=native call.c main.c -mno-invariant-function-descriptors : ~4 seconds (looks like we'd benefit from additional loop unrolling here, as a first guess, because this is faster with the extra loads) The -mno-invariant-function-descriptors will be added to Clang shortly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226207 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 21:17:34 +00:00
Colin LeMahieu	02b677594c	[Hexagon] Updating indexed load-extend patterns and changing test to new expected output. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226206 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 21:07:52 +00:00
Hal Finkel	39b09ae788	Revert "r226086 - Revert "r226071 - [RegisterCoalescer] Remove copies to reserved registers"" Reapply r226071 with fixes. Two fixes: 1. We need to manually remove the old and create the new 'dead defs' associated with physical register definitions when we move the definition of the physical register from the copy point to the point of the original vreg def. This problem was picked up by the machine-instruction verifier, and could trigger a verification failure on test/CodeGen/X86/2009-02-12-DebugInfoVLA.ll, so I've turned on the verifier in the tests. 2. When moving the def point of the phys reg up, we need to make sure that it is neither defined nor read in between the two instructions. We don't, however, extend the live ranges of phys reg defs to cover uses, so just checking for live-range overlap between the pair interval and the phys reg aliases won't pick up reads. As a result, we manually iterate over the range and check for reads. A test soon to be committed to the PowerPC backend will test this change. Original commit message: [RegisterCoalescer] Remove copies to reserved registers This allows the RegisterCoalescer to join "non-flipped" range pairs with a physical destination register -- which allows the RegisterCoalescer to remove copies like this: <vreg> = something (maybe a load, for example) ... (things that don't use PHYSREG) PHYSREG = COPY <vreg> (with all of the restrictions normally applied by the RegisterCoalescer: having compatible register classes, etc. ) Previously, the RegisterCoalescer handled only the opposite case (copying from a physical register). I don't handle the problem fully here, but try to get the common case where there is only one use of <vreg> (the COPY). An upcoming commit to the PowerPC backend will make this pattern much more common on PPC64/ELF systems. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226200 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 20:32:09 +00:00
Matt Arsenault	ecbec418bd	R600/SI: Improve fpext / fptrunc test coverage git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226197 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 19:39:42 +00:00
Marek Olsak	232d5fa02c	R600/SI: Use 64-bit encoding by default for opcodes that are VOP3-only on VI git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226190 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 18:43:01 +00:00
Ramkumar Ramachandra	4f158a708b	statepoint tests: use statepoint-example gc Mechanical conversion of statepoint tests to use the statepoint-example gc. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226183 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 18:10:44 +00:00
Colin LeMahieu	044438aff5	[Hexagon] Deleting old float comparison instruction and updating references to new ones. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226179 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 17:28:14 +00:00
Colin LeMahieu	4ce3b1e4ce	[Hexagon] Replacing old fadd/fsub instructions and updating references. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226176 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 16:30:07 +00:00
Timur Iskhodzhanov	d048b3be70	Revert Don't create new comdats in CodeGen It breaks AddressSanitizer on Windows. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226173 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 16:14:34 +00:00
Daniel Sanders	cb71ef1b46	[mips] Fix a typo in the compare patterns for MIPS32r6/MIPS64r6. Summary: The patterns intended for the SETLE node were actually matching the SETLT node. Reviewers: atanasyan, sstankovic, vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6997 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226171 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 15:41:03 +00:00
Hal Finkel	f908e37144	Revert "r226071 - [RegisterCoalescer] Remove copies to reserved registers" Reverting this while I investigate some bad behavior this is causing. As a possibly-related issue, adding -verify-machineinstrs to one of the test cases now fails because of this change: llc test/CodeGen/X86/2009-02-12-DebugInfoVLA.ll -march=x86-64 -o - -verify-machineinstrs * Bad machine code: No instruction at def index * - function: foo - basic block: BB#0 return (0x10007e21f10) [0B;736B) - liverange: [128r,128d:9)[160r,160d:8)[176r,176d:7)[336r,336d:6)[464r,464d:5)[480r,480d:4)[624r,624d:3)[752r,752d:2)[768r,768d:1)[784r,784d:0) 0@784r 1@768r 2@752r 3@624r 4@480r 5@464r 6@336r 7@176r 8@160r 9@128r - register: %DS Valno #3 is defined at 624r * Bad machine code: Live segment doesn't end at a valid instruction * - function: foo - basic block: BB#0 return (0x10007e21f10) [0B;736B) - liverange: [128r,128d:9)[160r,160d:8)[176r,176d:7)[336r,336d:6)[464r,464d:5)[480r,480d:4)[624r,624d:3)[752r,752d:2)[768r,768d:1)[784r,784d:0) 0@784r 1@768r 2@752r 3@624r 4@480r 5@464r 6@336r 7@176r 8@160r 9@128r - register: %DS [624r,624d:3) LLVM ERROR: Found 2 machine code errors. where 624r corresponds exactly to the interval combining change: 624B %RSP<def> = COPY %vreg16; GR64:%vreg16 Considering merging %vreg16 with %RSP RHS = %vreg16 [608r,624r:0) 0@608r updated: 608B %RSP<def> = MOV64rm <fi#3>, 1, %noreg, 0, %noreg; mem:LD8[%saved_stack.1] Success: %vreg16 -> %RSP Result = %RSP git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226086 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 03:08:59 +00:00
Hal Finkel	47ab8c106f	[RegisterCoalescer] Remove copies to reserved registers This allows the RegisterCoalescer to join "non-flipped" range pairs with a physical destination register -- which allows the RegisterCoalescer to remove copies like this: <vreg> = something (maybe a load, for example) ... (things that don't use PHYSREG) PHYSREG = COPY <vreg> (with all of the restrictions normally applied by the RegisterCoalescer: having compatible register classes, etc. ) Previously, the RegisterCoalescer handled only the opposite case (copying from a physical register). I don't handle the problem fully here, but try to get the common case where there is only one use of <vreg> (the COPY). An upcoming commit to the PowerPC backend will make this pattern much more common on PPC64/ELF systems. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226071 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 01:25:28 +00:00
Philip Reames	8f9d11309a	getMangledTypeStr: clarify how it mangles types, and add tests "Write a set of tests that show how name mangling is done for overloaded intrinsics." These happen to use gc.relocates to exercise the codepath in question, but is not a GC specific test. Patch by: artagnon@gmail.com Differential Revision: http://reviews.llvm.org/D6915 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226056 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 23:05:17 +00:00
Duncan P. N. Exon Smith	37ac8d3622	IR: Move MDLocation into place This commit moves `MDLocation`, finishing off PR21433. There's an accompanying clang commit for frontend testcases. I'll attach the testcase upgrade script I used to PR21433 to help out-of-tree frontends/backends. This changes the schema for `DebugLoc` and `DILocation` from: !{i32 3, i32 7, !7, !8} to: !MDLocation(line: 3, column: 7, scope: !7, inlinedAt: !8) Note that empty fields (line/column: 0 and inlinedAt: null) don't get printed by the assembly writer. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226048 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 22:27:36 +00:00
Rafael Espindola	33f5127540	Don't create new comdats in CodeGen. This patch stops the implicit creation of comdats during codegen. Clang now sets the comdat explicitly when it is required. With this patch clang and gcc now produce the same result in pr19848. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226038 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 20:55:48 +00:00
Chandler Carruth	8af8091ef5	[MBP] Add flags to disable the BadCFGConflict check in MachineBlockPlacement. Some benchmarks have shown that this could lead to a potential performance benefit, and so adding some flags to try to help measure the difference. A possible explanation. In diamond-shaped CFGs (A followed by either B or C both followed by D), putting B and C both in between A and D leads to the code being less dense than it could be. Either B or C always has to be skipped, increasing the chance of cache misses etc. Moving either B or C to after D might be beneficial on average. In the long run, though, we should probably do a better job of analyzing the basic block and branch probabilities to move the correct one of B or C to after D. But even if we don't use this in the long run, it is a good baseline for benchmarking. Original patch authored by Daniel Jasper with test tweaks and a second flag added by me. Differential Revision: http://reviews.llvm.org/D6969 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226034 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 20:19:29 +00:00
Bill Schmidt	11abe69e98	[PPC64] Add support for the ICBT instruction on POWER8. Patch by Kit Barton. Support for the ICBT instruction is currently present, but limited to embedded processors. This change adds a new FeatureICBT that can be used to identify whether the ICBT instruction is available on a specific processor. Two new tests are added: * Positive test to ensure the icbt instruction is present when using -mcpu=pwr8 * Negative test to ensure the icbt instruction is not generated when using -mcpu=pwr7 Both test cases use the Prefetch opcode in LLVM. They are based on the ppc64-prefetch.ll test case. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226033 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 20:17:10 +00:00
Olivier Sallenave	735aa71398	Check that the TLI callback enableAggressiveFMAFusion has the desired effect on FMA folding. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225987 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 15:36:28 +00:00
Kai Nacke	92e28620d3	[mips] Refine octeon instructions seq/seqi/sne/snei This commit refines the pattern for the octeon seq/seqi/sne/snei instructions. The target register is set to 0 or 1 according to the result of the comparison. In C, this is something like rd = (unsigned long)(rs == rt) This commit adds a zext to bring the result to i64. With this change the instruction is selected for this type of code. (gcc produces the same code for the above C code.) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225968 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 10:19:09 +00:00
Brad Smith	f449c53c89	Use the integrated assembler by default on SPARC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225957 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 07:53:39 +00:00
JF Bastien	7f0cbb5703	Revert "Insert random noops to increase security against ROP attacks (llvm)" This reverts commit: http://reviews.llvm.org/D3392 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225948 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 05:24:33 +00:00
NAKAMURA Takumi	2a38522280	Disable a couple of tests, CodeGen/X86/noop-insert.ll and CodeGen/X86/noop-insert-percentage.ll, in r225908, to unbreak tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225940 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 04:21:33 +00:00
Tim Northover	09bec94c16	ARM: add test for crc32 instructions in CodeGen. Somehow we seem to have ended up without any actual tests of the CodeGen side. Easy enough to fix. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225930 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 01:43:33 +00:00
Hal Finkel	037c21f82c	[PowerPC] Fix the noop-insert test The form of nops used is CPU-specific (some CPUs, such as the POWER7, have special group-terminating nops). We probably want a different callback for this kind of nop insertion (something more like MCAsmBackend::writeNopData), or for PPC to use a different mechanism for scheduling nops, but this will stop the test from failing for now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225928 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 01:37:21 +00:00
Matt Arsenault	140c2ece1e	R600/SI: Remove some redundant load testcases. This reduces coverage for Evergreen, since the more complete tests have those run lines disabled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225927 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 01:35:26 +00:00
Matt Arsenault	781f7ee502	R600/SI: Fix bad code with unaligned byte vector loads Don't do the v4i8 -> v4f32 combine if the load will need to be expanded due to alignment. This stops adding instructions to repack into a single register that the v_cvt_ubyteN_f32 instructions read. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225926 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 01:35:22 +00:00
Matt Arsenault	8b6a26ca85	Implement new way of expanding extloads. Now that the source and destination types can be specified, allow doing an expansion that doesn't use an EXTLOAD of the result type. Try to do a legal extload to an intermediate type and extend that if possible. This generalizes the special case custom lowering of extloads R600 has been using to work around this problem. This also happens to fix a bug that would incorrectly use more aligned loads than should be used. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225925 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 01:35:17 +00:00
Hal Finkel	ade705c6e5	Revert "r225811 - Revert "r225808 - [PowerPC] Add StackMap/PatchPoint support"" This re-applies r225808, fixed to avoid problems with SDAG dependencies along with the preceding fix to ScheduleDAGSDNodes::RegDefIter::InitNodeNumDefs. These problems caused the original regression tests to assert/segfault on many (but not all) systems. Original commit message: This commit does two things: 1. Refactors PPCFastISel to use more of the common infrastructure for call lowering (this lets us take advantage of this common code for lowering some common intrinsics, stackmap/patchpoint among them). 2. Adds support for stackmap/patchpoint lowering. For the most part, this is very similar to the support in the AArch64 target, with the obvious differences (different registers, NOP instructions, etc.). The test cases are adapted from the AArch64 test cases. One difference of note is that the patchpoint call sequence takes 24 bytes, so you can't use less than that (on AArch64 you can go down to 16). Also, as noted in the docs, we take the patchpoint address to be the actual code address (assuming the call is local in the TOC-sharing sense), which should yield higher performance than generating the full cross-DSO indirect-call sequence and is likely just as useful for JITed code (if not, we'll change it). StackMaps and Patchpoints are still marked as experimental, and so this support is doubly experimental. So go ahead and experiment! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225909 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 01:07:51 +00:00
JF Bastien	21befa7761	Insert random noops to increase security against ROP attacks (llvm) A pass that adds random noops to X86 binaries to introduce diversity with the goal of increasing security against most return-oriented programming attacks. Command line options: -noop-insertion // Enable noop insertion. -noop-insertion-percentage=X // X% of assembly instructions will have a noop prepended (default: 50%, requires -noop-insertion) -max-noops-per-instruction=X // Randomly generate X noops per instruction. ie. roll the dice X times with probability set above (default: 1). This doesn't guarantee X noop instructions. In addition, the following 'quick switch' in clang enables basic diversity using default settings (currently: noop insertion and schedule randomization; it is intended to be extended in the future). -fdiversify This is the llvm part of the patch. clang part: D3393 http://reviews.llvm.org/D3392 Patch by Stephen Crane (@rinon) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225908 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 01:07:26 +00:00
Reid Kleckner	504fa89c8e	CodeGen support for x86_64 SEH catch handlers in LLVM This adds handling for ExceptionHandling::MSVC, used by the x86_64-pc-windows-msvc triple. It assumes that filter functions have already been outlined in either the frontend or the backend. Filter functions are used in place of the landingpad catch clause type info operands. In catch clause order, the first filter to return true will catch the exception. The C specific handler table expects the landing pad to be split into one block per handler, but LLVM IR uses a single landing pad for all possible unwind actions. This patch papers over the mismatch by synthesizing single instruction BBs for every catch clause to fill in the EH selector that the landing pad block expects. Missing functionality: - Accessing data in the parent frame from outlined filters - Cleanups (from __finally) are unsupported, as they will require outlining and parent frame access - Filter clauses are unsupported, as there's no clear analogue in SEH In other words, this is the minimal set of changes needed to write IR to catch arbitrary exceptions and resume normal execution. Reviewers: majnemer Differential Revision: http://reviews.llvm.org/D6300 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225904 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 01:05:27 +00:00
Adam Nemet	656da67bc0	[AVX512] Add 16x32 unpck tests as well Forgot this from r225838. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225850 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 23:27:55 +00:00
Adam Nemet	f38f71d8a0	Fix function names in tests from r225838. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225840 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 22:40:15 +00:00
Adam Nemet	293f71ddd2	[AVX512] Unpack support in new shuffle lowering This now handles both 32 and 64-bit element sizes. In this version, the tests are in vector-shuffle-512-v8.ll, canonicalized by Chandler's update_llc_test_checks.py. Part of <rdar://problem/17688758> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225838 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 22:20:18 +00:00
Matt Arsenault	8603a3d1c5	R600: Implement getRsqrtEstimate Only do for f32 since I'm unclear on both what this is expecting for the refinement steps in terms of accuracy, and what the f64 instruction actually provides. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225827 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 20:53:18 +00:00
Matt Arsenault	9e495c518c	R600: Make cttz / ctlz cheap to speculate Speculating things is generally good. SI+ has instructions for these for 32-bit values. This is still probably better even with the expansion for 64-bit values, although it is odd that this callback doesn't have the size as a parameter. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225822 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 19:46:48 +00:00
Ulrich Weigand	81d2500685	Use the integrated assembler as default on SystemZ This was already done in clang, this commit now uses the integrated assembler as default when using LLVM tools directly. A number of test cases deliberately using an invalid instruction in inline asm now have to use -no-integrated-as. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225820 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 19:45:16 +00:00
Ulrich Weigand	5a4c26e7bc	Use the integrated assembler as default on PowerPC This was already done in clang, this commit now uses the integrated assembler as default when using LLVM tools directly. A number of test cases using inline asm had to be adapted, either by updating the expected output, or by using -no-integrated-as (for such tests that deliberately use an invalid instruction in inline asm). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225819 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 19:43:45 +00:00
Hal Finkel	ea55eceaed	Revert "r225808 - [PowerPC] Add StackMap/PatchPoint support" Reverting this while I investigate buildbot failures (segfaulting in GetCostForDef at ScheduleDAGRRList.cpp:314). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225811 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 18:25:05 +00:00
Hal Finkel	232f393466	[PowerPC] Add StackMap/PatchPoint support This commit does two things: 1. Refactors PPCFastISel to use more of the common infrastructure for call lowering (this lets us take advantage of this common code for lowering some common intrinsics, stackmap/patchpoint among them). 2. Adds support for stackmap/patchpoint lowering. For the most part, this is very similar to the support in the AArch64 target, with the obvious differences (different registers, NOP instructions, etc.). The test cases are adapted from the AArch64 test cases. One difference of note is that the patchpoint call sequence takes 24 bytes, so you can't use less than that (on AArch64 you can go down to 16). Also, as noted in the docs, we take the patchpoint address to be the actual code address (assuming the call is local in the TOC-sharing sense), which should yield higher performance than generating the full cross-DSO indirect-call sequence and is likely just as useful for JITed code (if not, we'll change it). StackMaps and Patchpoints are still marked as experimental, and so this support is doubly experimental. So go ahead and experiment! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225808 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 17:48:12 +00:00
Jozef Kolek	abdc0284ff	[mips][microMIPS] Fix issue with 16b instructions in jr instruction delay slot 16-bit instructions are not allowed in the jr delay slot. The same holds for PseudoIndirectBranch and PseudoReturn. Differential Revision: http://reviews.llvm.org/D6815 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225798 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 15:59:17 +00:00
Reid Kleckner	d8f69a7201	Rename llvm.recoverframeallocation to llvm.framerecover This name is less descriptive, but it sort of puts things in the 'llvm.frame...' namespace, relating it to frameallocate and frameaddress. It also avoids using "allocate" and "allocation" together. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225752 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 01:51:34 +00:00
Reid Kleckner	221a7075cf	Add the llvm.frameallocate and llvm.recoverframeallocation intrinsics These intrinsics allow multiple functions to share a single stack allocation from one function's call frame. The function with the allocation may only perform one allocation, and it must be in the entry block. Functions accessing the allocation call llvm.recoverframeallocation with the function whose frame they are accessing and a frame pointer from an active call frame of that function. These intrinsics are very difficult to inline correctly, so the intention is that they be introduced rarely, or at least very late during EH preparation. Reviewers: echristo, andrew.w.kaylor Differential Revision: http://reviews.llvm.org/D6493 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225746 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 00:48:10 +00:00
Matt Arsenault	29ad7506e1	Combine fcmp + select to fminnum / fmaxnum if no nans and legal Also require unsafe FP math for now since there isn't a way to test for signed zeros. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225744 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 00:43:00 +00:00
Reid Kleckner	1ec250a32f	musttail: Only set the inreg flag for fastcall and vectorcall Otherwise we'll attempt to forward ECX, EDX, and EAX for cdecl and stdcall thunks, leaving us with no scratch registers for indirect call targets. Fixes PR22052. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225729 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-12 23:28:23 +00:00
Adrian Prantl	f89325d832	Debug info: Factor out the creation of DWARF expressions from AsmPrinter into a new class DwarfExpression that can be shared between AsmPrinter and DwarfUnit. This is the first step towards unifying the two entirely redundant implementations of dwarf expression emission in DwarfUnit and AsmPrinter. Almost no functional change — Testcases were updated because asm comments that used to be on two lines now appear on the same line, which is actually preferable. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225706 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-12 22:19:22 +00:00
Ahmed Bougacha	cd5bbd8bad	[X86] Also create+widen FMIN/FMAX nodes for v2f32. This happens in the HINT benchmark, where the SLP-vectorizer created v2f32 fcmp/select code. The "correct" solution would have been to teach the vectorizer cost model that v2f32 isn't legal (because really, it isn't), but if we can vectorize we might as well do so. We legalize these v2f32 FMIN/FMAX nodes by widening to v4f32 later on. v3f32 were already widened to v4f32 by the generic unroll-and-build-vector legalization. rdar://15763436 Differential Revision: http://reviews.llvm.org/D6557 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225691 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-12 20:31:30 +00:00
Ahmed Bougacha	5316023a4e	[X86] Make SSE min/max testcases more explicit. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225687 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-12 20:15:47 +00:00
Tom Stellard	d275e025d2	R600/SI: Use RegisterOperands to specify which operands can accept immediates There are some operands which can take either immediates or registers and we were previously using different register class to distinguish between operands that could take immediates and those that could not. This patch switches to using RegisterOperands which should simplify the backend by reducing the number of register classes and also make it easier to implement the assembler. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225662 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-12 19:33:18 +00:00
Hal Finkel	b6bb7db62b	[PowerPC] Fix calls to non-function objects Looking at r225438 inspired me to see how the PowerPC backend handled the situation (calling a bitcasted TLS global), and it turns out we also produced an error (cannot select ...). What it means to "call" something that is not a function is implementation and platform specific, but in the name of doing something (besides crashing), this makes sure we do what GCC does (treat all such calls as calls through a function pointer -- meaning that the pointer is assumed, as is the convention on PPC, to point to a function descriptor structure holding the actual code address along with the function's TOC pointer and environment pointer). As GCC does, we now do the same for calling regular (non-TLS) non-function globals too. I'm not sure whether this is the most useful way to define the behavior, but at least we won't be alone. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225617 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-12 04:34:47 +00:00
David Majnemer	85a0cb9bf2	Revert most of r225597 We can't rely on a DataLayout enlightened constant folder. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225599 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-11 07:29:51 +00:00
David Majnemer	d2f4460ee7	X86: Properly decode shuffle masks when the constant pool type is weird It's possible for the constant pool entry for the shuffle mask to come from a completely different operation. This occurs when Constants have the same bit pattern but have different types. Make DecodePSHUFBMask tolerant of types which, after a bitcast, are appropriately sized vector types. This fixes PR22188. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225597 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-11 05:08:57 +00:00
Saleem Abdulrasool	776673ea09	X86: teach X86TargetLowering about L,M,O constraints Teach the ISelLowering for X86 about the L,M,O target specific constraints. Although, for the moment, clang performs constraint validation and prevents passing along inline asm which may have immediate constant constraints violated, the backend should be able to cope with the invalid inline asm a bit better. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225596 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-11 04:39:24 +00:00
Chandler Carruth	561088eb5d	[x86] Remove some windows line endings that snuck into the tests here. Folks on Windows, remember to set up your subversion to strip these when submitting... git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225593 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-11 01:36:20 +00:00
Sanjoy Das	7f0da20b97	Fix PR22179. We were incorrectly inferring nsw for certain SCEVs. We can be more aggressive here (see Richard Smith's comment on http://llvm.org/bugs/show_bug.cgi?id=22179) but this change just focuses on correctness. Differential Revision: http://reviews.llvm.org/D6914 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225591 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-10 23:41:24 +00:00
Simon Pilgrim	47abf0e3da	[X86][SSE] Improved (v)insertps shuffle matching In the current code we only attempt to match against insertps if we have exactly one element from the second input vector, irrespective of how much of the shuffle result is zeroable. This patch checks to see if there is a single non-zeroable element from either input that requires insertion. It also supports matching of cases where only one of the inputs need to be referenced. We also split insertps shuffle matching off into a new lowerVectorShuffleAsInsertPS function. Differential Revision: http://reviews.llvm.org/D6879 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225589 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-10 19:45:33 +00:00
Hal Finkel	9ae5b7a40a	[PowerPC] Mark zext of a small scalar load as free This initial implementation of PPCTargetLowering::isZExtFree marks as free zexts of small scalar loads (that are not sign-extending). This callback is used by SelectionDAGBuilder's RegsForValue::getCopyToRegs, and thus to determine whether a zext or an anyext is used to lower illegally-typed PHIs. Because later truncates of zero-extended values are nops, this allows for the elimination of later unnecessary truncations. Fixes the initial complaint associated with PR22120. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225584 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-10 08:21:59 +00:00

13020 Commits