llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-08-23 17:29:19 +00:00

Author	SHA1	Message	Date
Colin LeMahieu	f309d8ee65	[Hexagon] Adding zxth instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222662 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-24 17:11:34 +00:00
Colin LeMahieu	a723df08bb	[Hexagon] Adding zxtb instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222660 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-24 16:48:43 +00:00
Jozef Kolek	c19526770e	[mips][microMIPS] Fix JRADDIUSP instruction Fix JRADDIUSP instruction, remove delay slot flag because this instruction doesn't have delay slot. Differential Revision: http://reviews.llvm.org/D6365 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222658 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-24 16:14:10 +00:00
Jozef Kolek	b955bed064	[mips][microMIPS] Implement LBU16, LHU16, LW16, SB16, SH16 and SW16 instructions Differential Revision: http://reviews.llvm.org/D5122 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222653 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-24 14:39:13 +00:00
Jozef Kolek	d49e74eaa5	[mips][microMIPS] Implement 16-bit instructions registers including ZERO instead of S0 Implement microMIPS 16-bit instructions register set: $0, $2-$7 and $17. Differential Revision: http://reviews.llvm.org/D5780 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222652 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-24 14:25:53 +00:00
Aaron Ballman	8d6f6a27c7	Removing a variable that is initialized but never read. The original author has been alerted to the warning, in case this variable is meant to be used. Fixes -Werror builds in the meantime. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222649 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-24 14:03:16 +00:00
Jozef Kolek	18700de8fc	[mips][microMIPS] Implement disassembler support for 16-bit instructions With the help of new method readInstruction16() two bytes are read and decodeInstruction() is called with DecoderTableMicroMips16, if this fails four bytes are read and decodeInstruction() is called with DecoderTableMicroMips32. Differential Revision: http://reviews.llvm.org/D6149 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222648 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-24 13:29:59 +00:00
Andrea Di Biagio	a1e1f01699	[X86] Improved target specific combine on VSELECT dag nodes. This patch teaches function 'transformVSELECTtoBlendVECTOR_SHUFFLE' how to convert VSELECT dag nodes to shuffles on targets that do not have SSE4.1. On pre-SSE4.1 targets, we can still perform blend operations using movss/movsd. Also, removed a target specific combine that performed a premature lowering of VSELECT nodes to target specific MOVSS/MOVSD nodes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222647 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-24 12:23:15 +00:00
Michael Kuperstein	d539147834	[X86] Fixes bug in build_vector v4x32 lowering r222375 made some improvements to build_vector lowering of v4x32 and v4xf32 into an insertps, but it missed a case where: 1. A single extracted element is used twice. 2. The lower of the two non-zero indexes should be preserved, and the higher should be used for the dest mask. This caused a crash, since the source value for the insertps ends-up uninitialized. Differential Revision: http://reviews.llvm.org/D6377 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222635 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-23 13:09:06 +00:00
Craig Topper	71777d18ad	Add missing override keywords. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222634 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-23 09:40:13 +00:00
Elena Demikhovsky	ae1ae2c3a1	Masked Vector Load and Store Intrinsics. Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores. Added SDNodes for masked operations and lowering patterns for X86 code generator. Examples: <16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align */, <16 x i1> %mask) declare void @llvm.masked.store.v8f64(i8* %addr, <8 x double> %value, i32 4, <8 x i1> %mask) Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch. http://reviews.llvm.org/D6191 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222632 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-23 08:07:43 +00:00
Matt Arsenault	4f5aa5994e	R600: Fix extloads of i1 on R600/Evergreen git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222631 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-23 02:57:54 +00:00
Matt Arsenault	716ce08250	R600: Fix assert on copy of an i1 on pre-SI i1 is not a legal type on Evergreen, so this combine proceeded and tried to produce a bitcast between i1 and i8. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222630 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-23 02:57:52 +00:00
Simon Pilgrim	53a43d38df	Tidied up target triple OS detection. NFC Use Triple::isOS*() helper functions where possible. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222622 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-22 19:12:10 +00:00
Chandler Carruth	e915b4b7c8	[x86] Teach the vector shuffle yet another step of canonicalization. No functionality changed yet, but this will prevent subsequent patches from having to handle permutations of various interleaved shuffle patterns. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222614 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-22 09:18:53 +00:00
Joerg Sonnenberger	0b1407b5cf	Fix transformation of add with pc argument to adr for non-immediate arguments. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222587 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-21 22:39:34 +00:00
Tom Stellard	bad4e7b748	R600/SI: Add an s_mov_b32 to patterns which use the M0RegClass We need to use a s_mov_b32 rather than a copy, so that CSE will eliminate redundant moves to the m0 register. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222584 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-21 22:31:46 +00:00
Tom Stellard	573630a020	R600/SI: Emit s_mov_b32 m0, -1 before every DS instruction This s_mov_b32 will write to a virtual register from the M0Reg class and all the ds instructions now take an extra M0Reg explicit argument. This change is necessary to prevent issues with the scheduler mixing together instructions that expect different values in the m0 registers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222583 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-21 22:31:44 +00:00
Tom Stellard	edcd88ce1a	R600/SI: Add SIFoldOperands pass This pass attempts to fold the source operands of mov and copy instructions into their uses. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222581 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-21 22:06:37 +00:00
Jozef Kolek	d9accc1e5f	[mips][microMIPS] This patch extends the MIPS delay slot filler so that, when it would have to put a NOP instruction into the delay slot of a microMIPS BEQ or BNE instruction that uses register $0, it replaces that instruction with the corresponding microMIPS compact branch instruction, i.e. BEQZC or BNEZC, instead of emitting the NOP. Differential Revision: http://reviews.llvm.org/D3566 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222580 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-21 22:04:35 +00:00
Tom Stellard	e83cdb9792	R600/SI: Mark s_mov_b32 and s_mov_b64 as rematerializable git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222579 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-21 22:00:16 +00:00
Colin LeMahieu	88109da602	[Hexagon] Adding sxth instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222577 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-21 21:54:59 +00:00
Colin LeMahieu	326816c88f	[Hexagon] Adding sxtb instruction. Renaming some identically named classes that will be removed after converting referencing defs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222575 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-21 21:35:52 +00:00
Colin LeMahieu	a3b0670792	[Hexagon] Removing SUB_rr and replacing with A2_sub. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222571 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-21 21:19:18 +00:00
Sanjay Patel	28660d4b2f	Add a feature flag for slow 32-byte unaligned memory accesses [x86]. This patch adds a feature flag to avoid unaligned 32-byte load/store AVX codegen for Sandy Bridge and Ivy Bridge. There is no functionality change intended for those chips. Previously, the absence of AVX2 was being used as a proxy to detect this feature. But that hindered codegen for AVX-enabled AMD chips such as btver2 that do not have the 32-byte unaligned access slowdown. Performance measurements are included in PR21541 ( http://llvm.org/bugs/show_bug.cgi?id=21541 ). Differential Revision: http://reviews.llvm.org/D6355 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222544 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-21 17:40:04 +00:00
Chandler Carruth	46c5a97adc	[x86] Restructure the checking patterns for v16 and v32 avx2 vector shuffle lowering to allow much better blend matching. Specifically, with the new structure the code seems clearer to me and we correctly can hit the cases where merging two 128-bit lanes is a clear win and can be shuffled cheaply afterward. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222539 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-21 14:53:03 +00:00
Chandler Carruth	0889d65fd5	[x86] Make the previous logic significantly less conservative and get a bunch more improvements. Non-lane-crossing is fine, the key is that lane merging only makes sense for single-input shuffles. Not sure why I got so turned around here. The code all works, I was just using the wrong model for it. This only updates v4 and v8 lowering. The v16 and v32 lowering requires restructuring the entire check sequence. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222537 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-21 14:33:24 +00:00
Chandler Carruth	bd357588a1	[x86] Teach the x86 vector shuffle lowering to detect mergable 128-bit lanes. By special casing these we can often either reduce the total number of shuffles significantly or reduce the number of (high latency on Haswell) AVX2 shuffles that potentially cross 128-bit lanes. Even when these don't actually cross lanes, they have much higher latency to support that. Doing two of them and a blend is worse than doing a single insert across the 128-bit lanes to blend and then doing a single interleaved shuffle. While this seems like a narrow case, it kept cropping up on me and the difference is huge as you can see in many of the test cases. I first hit this trying to perfectly fix the interleaving shuffle patterns used by Halide for AVX2. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222533 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-21 13:56:05 +00:00
Alexey Volkov	d0d0424368	[X86] For Silvermont CPU use 16-bit division instead of 64-bit for small positive numbers Differential Revision: http://reviews.llvm.org/D5938 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222521 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-21 11:19:34 +00:00
NAKAMURA Takumi	3bc8b4b38b	Add LLVMScalarOpts to LLVMPowerPCCodeGen. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222516 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-21 09:14:45 +00:00
Hao Liu	09ad94decb	DAGCombiner: Allow the DAGCombiner to combine multiple FDIVs with the same divisor into FMULs by the reciprocal. E.g., ( a / D; b / D ) -> ( recip = 1.0 / D; a * recip; b * recip) A hook is added to allow the target to control whether it needs to do such combine. Reviewed in http://reviews.llvm.org/D6334 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222510 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-21 06:39:58 +00:00
Craig Topper	e0ed7df6b0	Remove a bunch of unnecessary typecasts to 'const TargetRegisterClass *' git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222509 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-21 05:58:21 +00:00
Hal Finkel	361eafaffa	[PPC] Use SeparateConstOffsetFromGEP This mirrors r222331, which enabled SeparateConstOffsetFromGEP on AArch64, in the PowerPC backend. Yields, on a POWER7 machine, a 30% speedup on SingleSource/Benchmarks/Shootout/nestedloop (this might just be from LICM, there is a store moved out of the inner loop) and a potential speedup on MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec/mpeg2decode. Regardless, it makes some code look cleaner, and synchronizing the backends in this regard seems like a generally good thing. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222504 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-21 04:35:51 +00:00
Quentin Colombet	c91f34ae54	[X86] Do not custom lower UINT_TO_FP when the target type does not match the custom lowering. <rdar://problem/19026326> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222489 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-21 00:47:19 +00:00
Reid Kleckner	9c390888f7	Fix more instances of -Wsentinel on Windows with s/NULL/nullptr/ Follow up to r221940, where I must not have caught em all. NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222481 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-20 23:51:47 +00:00
Reid Kleckner	d12434058d	Add out of line virtual destructors to all LLVMTargetMachine subclasses These recently all grew a unique_ptr<TargetLoweringObjectFile> member in r221878. When anyone calls a virtual method of a class, clang-cl requires all virtual methods to be semantically valid. This includes the implicit virtual destructor, which triggers instantiation of the unique_ptr destructor, which fails because the type being deleted is incomplete. This is just part of the ongoing saga of PR20337, which is affecting Blink as well. Because the MSVC ABI doesn't have key functions, we end up referencing the vtable and implicit destructor on any virtual call through a class. We don't actually end up emitting the dtor, so it'd be good if we could avoid this unneeded type completion work. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222480 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-20 23:37:18 +00:00
Mehdi Amini	10754c251b	Update Makefile following directory removal in r222466 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222475 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-20 22:48:24 +00:00
Colin LeMahieu	a3712a1fb9	[Hexagon] [NFC] Merging InstPrinter directory in to MCTargetDesc since they have a circular dependency. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222458 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-20 21:56:35 +00:00
Saleem Abdulrasool	e6c1fc9a44	X86: use the correct alloca symbol for Windows Itanium Windows itanium targets the MSVCRT, and the stack probe symbol is provided by MSVCRT. This corrects the emission of stack probes on i686-windows-itanium. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222439 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-20 18:01:26 +00:00
Jyoti Allur	dc0b300304	[ELF] Prevent ARM ELF object writer from generating deprecated relocation code R_ARM_PLT32 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222414 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-20 05:58:11 +00:00
Craig Topper	136d5aeba4	Fix a typo in a comment. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222412 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-20 05:22:37 +00:00
Colin LeMahieu	e8cdd171f9	[Hexagon] Adding A2_xor instruction with IR selection pattern and test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222399 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-19 23:22:23 +00:00
Colin LeMahieu	fb1c650fd0	[Hexagon] Adding A2_or instruction with IR selection pattern and test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222396 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-19 22:58:04 +00:00
Andrea Di Biagio	53daaff125	[X86] Improved lowering of v4x32 build_vector dag nodes. This patch improves the lowering of v4f32 and v4i32 build_vector dag nodes that are known to have at least two non-zero elements. With this patch, a build_vector that performs a blend with zero is converted into a shuffle. This is done to let the shuffle legalizer expand the dag node in an optimal way. For example, if we know that a build_vector performs a blend with zero, we can try to lower it as a movq/blend instead of always selecting an insertps. This patch also improves the logic that lowers a build_vector into an insertps with zero masking. See for example the extra test cases added to test sse41.ll. Differential Revision: http://reviews.llvm.org/D6311 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222375 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-19 19:34:29 +00:00
Tom Stellard	334ebf33ea	R600/SI: Make SIInstrInfo::isOperandLegal() more strict A register operand that has a common sub-class with its instruction's defined register class is not always legal. For example, SReg_32 and M0Reg both have a common sub-class, but we can't use an SReg_32 in instructions that expect a M0Reg. This prevents the llvm.SI.sendmsg.ll test from failing when the fold operand pass is added. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222368 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-19 16:58:49 +00:00
Zoran Jovanovic	d67cd80220	[mips][micromips] Implement SWM32 and LWM32 instructions Differential Revision: http://reviews.llvm.org/D5519 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222367 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-19 16:44:02 +00:00
Jozef Kolek	9fece51399	[mips][microMIPS] Fix opcodes of MFHC1 and MTHC1 instructions. Differential Revision: http://reviews.llvm.org/D6169 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222355 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-19 13:37:51 +00:00
Jozef Kolek	e4e84b22fe	[mips][microMIPS] Implement CodeGen support for 16-bit instruction ADDIUR2. Differential Revision: http://reviews.llvm.org/D5800 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222352 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-19 13:23:58 +00:00
Jozef Kolek	5c6c7e3295	[mips][microMIPS] Implement CodeGen support for ADDIUS5 instruction. Differential Revision: http://reviews.llvm.org/D5799 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222351 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-19 13:11:09 +00:00
Jozef Kolek	43ae00e4e0	[mips][microMIPS] Implement LWXS instruction. Differential Revision: http://reviews.llvm.org/D5407 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222348 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-19 11:39:12 +00:00
Jozef Kolek	baf97d8987	[mips][microMIPS] Implement SDBBP and RDHWR instructions. Differential Revision: http://reviews.llvm.org/D5240 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222347 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-19 11:25:50 +00:00
Simon Pilgrim	a6943fff90	[X86][SSE] pslldq/psrldq byte shifts/rotation for SSE2 This patch builds on http://reviews.llvm.org/D5598 to perform byte rotation shuffles (lowerVectorShuffleAsByteRotate) on pre-SSSE3 (palignr) targets - pre-SSSE3 is only enabled on i8 and i16 vector targets where it is a more definite performance gain. I've also added a separate byte shift shuffle (lowerVectorShuffleAsByteShift) that makes use of the ability of the SLLDQ/SRLDQ instructions to implicitly shift in zero bytes to avoid the need to create a zero register if we had used palignr. Differential Revision: http://reviews.llvm.org/D5699 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222340 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-19 10:06:49 +00:00
David Blaikie	5401ba7099	Update SetVector to rely on the underlying set's insert to return a pair<iterator, bool> This is to be consistent with StringSet and ultimately with the standard library's associative container insert function. This led to updating SmallSet::insert to return pair<iterator, bool>, and then to update SmallPtrSet::insert to return pair<iterator, bool>, and then to update all the existing users of those functions... git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222334 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-19 07:49:26 +00:00
Hao Liu	0e8675a621	[AArch64] Disable useAA for Cortex-A57. Using AA during CodeGen is very useful for in-order cores. It is less useful for ooo cores. Also I find enabling useAA for Cortex-A57 may generate worse code for some test cases. If useAA in codegen is improved and beneficial for ooo cores, we can enable it again. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222333 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-19 06:48:56 +00:00
Hao Liu	8db9fbf7cd	[AArch64] Enable SeparateConstOffsetFromGEP, EarlyCSE and LICM passes on AArch64 backend. SeparateConstOffsetFromGEP can give more optimization opportunities related to GEPs, which benefits EarlyCSE and LICM. By enabling these passes we can have better address calculations and generate a better addressing mode. Some SPEC 2006 benchmarks (astar, gobmk, namd) have obvious improvements on Cortex-A57. Reviewed in http://reviews.llvm.org/D5864. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222331 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-19 06:39:53 +00:00
David Blaikie	1d4f28c6bc	Remove StringMap::GetOrCreateValue in favor of StringMap::insert Having two ways to do this doesn't seem terribly helpful and consistently using the insert version (which we already have) seems like it'll make the code easier to understand to anyone working with standard data structures. (I also updated many references to the Entry's key and value to use first() and second instead of getKey{Data,Length,} and get/setValue - for similar consistency) Also removes the GetOrCreateValue functions so there's less surface area to StringMap to fix/improve/change/accommodate move semantics, etc. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222319 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-19 05:49:42 +00:00
Weiming Zhao	d8e31c73cd	[Aarch64] Custom lowering of CTPOP to SIMD should check for NEON availability git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222292 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-19 00:29:14 +00:00
Matt Arsenault	1bd96c574c	R600/SI: Implement areMemAccessesTriviallyDisjoint This partially makes up for not having address spaces used for alias analysis in some simple cases. This is not yet enabled by default so shouldn't change anything yet. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222286 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-19 00:01:31 +00:00
Matt Arsenault	b556213712	R600/SI: Set hasSideEffects = 0 on load and store instructions. Assuming unmodeled side effects interferes with some scheduling opportunities. Don't put it in the base class of DS instructions since there are a few weird effecting, non load/store instructions there. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222285 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-18 23:57:33 +00:00
Simon Pilgrim	e6d1a2625f	[X86][AVX] 256-bit vector stack unaligned load/stores identification Under many circumstances the stack is not 32-byte aligned, resulting in the use of the vmovups/vmovupd/vmovdqu instructions when inserting ymm reloads/spills. This minor patch adds these instructions to the isFrameLoadOpcode/isFrameStoreOpcode helpers so that they can be correctly identified and not be treated as folded reloads/spills. This has also been noticed by http://llvm.org/bugs/show_bug.cgi?id=18846 where it was causing redundant spills - I've added a reduced test case at test/CodeGen/X86/pr18846.ll Differential Revision: http://reviews.llvm.org/D6252 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222281 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-18 23:38:19 +00:00
Colin LeMahieu	642bb08576	[Hexagon] Adding A2_and instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222274 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-18 22:45:47 +00:00
Chad Rosier	32dc2de667	[FastISel][AArch64] Also allow folding of sign-/zero-extend and arithmetic shift-right for booleans (i1). Arithmetic shift-right immediate with sign-/zero-extensions also works for boolean values. Update the assert and the test cases to reflect that fact. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222272 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-18 22:41:49 +00:00
Chad Rosier	5e3288f85b	[FastISel][AArch64] Also allow folding of sign-/zero-extend and logical shift-right for booleans (i1). Logical shift-right immediate with sign-/zero-extensions also works for boolean values. Update the assert and the test cases to reflect that fact. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222270 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-18 22:38:42 +00:00
Colin LeMahieu	ed37b1e2d0	[Hexagon] Adding A2_sub instruction Renaming test files. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222263 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-18 21:51:51 +00:00
Juergen Ributzka	52e0f75f82	[FastISel][AArch64] Follow-up fix for "Fix shift-immediate emission for "zero" shifts." Shifts also perform sign-/zero-extends to larger types, which requires us to emit an integer extend instead of a simple COPY. Related to PR21594. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222257 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-18 21:20:17 +00:00
Matt Arsenault	a140448780	R600/SI: Move SIFixSGPRCopies to inst selector passes This should expose more of the actually used VALU instructions to the machine optimization passes. This also should help getting i1 handling into a better state. For not entirely understood reasons, this fixes the split-scalar-i64-add.ll test where a 64-bit add would only partially be moved to the VALU resulting in use of undefined VCC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222256 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-18 21:06:58 +00:00
Juergen Ributzka	50fa2ff5d2	[AArch64] Don't optimize all compare instructions. "optimizeCompareInstr" converts compares (cmp/cmn) into plain sub/add instructions when the flags are not used anymore. This conversion is valid for most instructions, but not all. Some instructions that don't set the flags (e.g. sub with immediate) can set the SP, whereas the flag setting version uses the same encoding for the "zero" register. Update the code to also check for the return register before performing the optimization to make sure that a cmp doesn't suddenly turn into a sub that sets the stack pointer. I don't have a test case for this, because it isn't easy to trigger. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222255 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-18 21:02:40 +00:00
Tom Stellard	891e9e7869	R600/SI: Make sure resource descriptors are always stored in SGPRs git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222253 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-18 20:39:39 +00:00
Colin LeMahieu	b7927f100d	[Hexagon] Converting from ADD_rr to A2_add which has encoding bits. Adding test to show correct instruction selection and encoding. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222249 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-18 20:28:11 +00:00
Juergen Ributzka	8b62d78689	[FastISel][AArch64] Fix shift-immediate emission for "zero" shifts. This change emits a COPY for a shift-immediate with a "zero" shift value. This fixes PR21594 where we emitted a shift instruction with an incorrect immediate operand. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222247 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-18 19:58:59 +00:00
Jozef Kolek	c8ec320371	Test commit to verify that commit access works. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222244 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-18 19:20:34 +00:00
Reid Kleckner	8083adcaca	Revert "ADT: correctly report isMSVCEnvironment for windows itanium" This reverts commit r222180. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222188 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-17 22:55:59 +00:00
Saleem Abdulrasool	2bd09db07f	ADT: correctly report isMSVCEnvironment for windows itanium The itanium environment on Windows uses MSVC and is a MSVC environment. Report this correctly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222180 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-17 22:13:26 +00:00
Matt Arsenault	84230f9a53	R600/SI: Don't copy flags when extracting subreg This was resulting in use of a register after a kill. For some reason this showed up as a problem in many tests when moving the SIFixSGPRCopies pass closer to instruction selection. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222175 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-17 21:11:37 +00:00
Matt Arsenault	6a95eb021b	R600/SI: Assume SIFixSGPRCopies makes changes I'm not sure if this was breaking anything. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222174 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-17 21:11:34 +00:00
Alexey Volkov	19e8fe05dc	[X86] Use ADD/SUB instead of INC/DEC for Haswell and Broadwell CPUs Differential Revision: http://reviews.llvm.org/D5934 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222141 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-17 16:17:51 +00:00
Oliver Stannard	8f832fce3b	[Thumb1] Re-write emitThumbRegPlusImmediate This was motivated by a bug which caused code like this to be miscompiled: declare void @take_ptr(i8*) define void @test() { %addr1.32 = alloca i8 %addr2.32 = alloca i32, i32 1028 call void @take_ptr(i8* %addr1) ret void } This was emitting the following assembly to get the value of %addr1: add r0, sp, #1020 add r0, r0, #8 However, "add r0, r0, #8" is not a valid Thumb1 instruction, and this could not be assembled. The generated object file contained this, resulting in r0 holding SP+8 rather than SP+1028: add r0, sp, #1020 add r0, sp, #8 This function looked like it could have caused miscompilations for other combinations of registers and offsets (though I don't think it is currently called with these), and the heuristic it used did not match the emitted code in all cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222125 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-17 11:18:10 +00:00
Craig Topper	56391ddf5d	Convert some EVTs to MVTs where only a SimpleValueType is needed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222109 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-16 21:17:18 +00:00
Craig Topper	a51382a63a	[x86] Remove two redundant isel patterns. The equivalent already exists in the instruction pattern. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222094 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-16 09:24:16 +00:00
Simon Pilgrim	01e39346f3	[X86][SSE] Improve legal SHUFP and PSHUFD shuffle matching Updated X86TargetLowering::isShuffleMaskLegal to match SHUFP masks with commuted inputs and PSHUFD masks that reference the second input. As part of this I've refactored isPSHUFDMask to work in a more general manner and allow it to match against either the first or second input vector. Differential Revision: http://reviews.llvm.org/D6287 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222087 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-15 21:13:05 +00:00
Matt Arsenault	c093062447	R600: Permute operands when selecting legacy min/max This gets the correct NaN behavior based on the compare type the hardware uses. This now passes the new piglit test I have for this on SI. Add stricter tests for the operand order. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222079 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-15 05:02:57 +00:00
Tom Stellard	be11bdfe20	R600: Fix 64-bit integer division This fixes a failure in one of the oclconform tests. Patch by: Jan Vesely git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222073 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-15 01:07:57 +00:00
Tom Stellard	57b2b0c4da	R600: Factor i64 UDIVREM lowering into its own function This is so it could potentially be used by SI. However, the current implementation does not always produce correct results, so the IntegerDivisionPass is being used instead. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222072 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-15 01:07:53 +00:00
Reid Kleckner	0737b4ee14	Rename EH related stuff to be more precise Summary: The current "WinEH" exception handling type is more about Itanium-style LSDA tables layered on top of the Windows native unwind info format instead of .eh_frame tables or EHABI unwind info. Use the name "ItaniumWinEH" to better reflect the hybrid nature of the design. Also rename isExceptionHandlingDWARF to usesItaniumLSDAForExceptions, since the LSDA is part of the Itanium C++ ABI document, and not the DWARF standard. Reviewers: echristo Subscribers: llvm-commits, compnerd Differential Revision: http://reviews.llvm.org/D6279 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222062 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-14 23:31:07 +00:00
Tim Northover	52e76186d3	ARM: refactor .cfi_def_cfa_offset emission. We used to track quite a few "adjusted" offsets through the FrameLowering code to account for changes in the prologue instructions as we went and allow the emission of correct CFA annotations. However, we were missing a couple of cases and the code was almost impenetrable. It's easier to just add any stack-adjusting instruction to a list and emit them together. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222057 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-14 22:45:33 +00:00
Tim Northover	d96893fd3d	ARM: correctly calculate the offset of FP in its push. When we folded the DPR alignment gap into a push, we weren't noting the extra distance from the beginning of the push to the FP, and so FP ended up pointing at an incorrect offset. The .cfi_def_cfa_offset directives are still wrong in this case, but I think that can be improved by refactoring. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222056 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-14 22:45:31 +00:00
Tom Stellard	99b3234323	R600/SI: Mark s_movk_i32 as rematerializable git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222037 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-14 20:43:28 +00:00
Tom Stellard	6beb81daa5	R600/SI: Fix spilling of m0 register If we have spilled the value of the m0 register, then we need to restore it with v_readlane_b32 to a regular sgpr, because v_readlane_b32 can't write to m0. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222036 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-14 20:43:26 +00:00
Matt Arsenault	24e874a1dd	R600/SI: Combine min3/max3 instructions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222032 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-14 20:08:52 +00:00
Matt Arsenault	848d9223c5	R600/SI: Fix verifier error from a branch on IMPLICIT_DEF SIILowerI1Copies wasn't correctly handling this case. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222020 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-14 18:43:41 +00:00
Matt Arsenault	3dd7f8668b	Fix unused variable warning without asserts git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222017 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-14 18:40:49 +00:00
Matt Arsenault	01213b1132	R600/SI: Match integer min / max instructions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222015 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-14 18:30:06 +00:00
Matt Arsenault	8fd3b90c3f	R600/SI: Use S_BFE_I64 for 64-bit sext_inreg git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222012 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-14 18:18:16 +00:00
Cameron McInally	b3625eb445	[AVX512] Add 512b masked integer shift by immediate patterns. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222002 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-14 15:43:00 +00:00
Tom Stellard	239d5231f3	R600/SI: Fix assembly names for exec_hi and exec_lo git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221995 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-14 14:08:04 +00:00
Tom Stellard	19cb35b4bc	R600/SI: Start implementing an assembler This was done using the Sparc and PowerPC AsmParsers as guides. So far it is very simple and only supports sopp instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221994 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-14 14:08:00 +00:00
Bill Schmidt	40b0f5d6ce	[PowerPC] Add VSX builtins for vec_div This patch adds builtin support for xvdivdp and xvdivsp, along with a test case. Straightforward stuff. There's a companion patch for Clang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221983 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-14 12:10:40 +00:00
Matt Arsenault	d9cd6cfb7d	R600/SI: Make constant array static git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221965 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-14 02:21:58 +00:00
Tim Northover	4a7bbf4c29	X86: use getConstant rather than getTargetConstant behind BUILD_VECTOR. getTargetConstant should only be used when you can guarantee the instruction selected will be able to cope with the raw value. BUILD_VECTOR is rather too generic for this so we should use getConstant instead. In that case, an instruction can still consume the constant, but if it doesn't it'll be materialised through its own round of ISel. Should fix PR21352. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221961 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-14 01:30:14 +00:00
Reid Kleckner	4f3c9858e0	Fix build of Mips code with MSVC by using our macro instead of __attribute__((unused)) directly git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221956 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-14 00:39:33 +00:00
Reed Kotler	198bb22754	First stage of call lowering for Mips fast-isel Summary: This has most of what is needed for mips fast-isel call lowering for O32. What is missing I will add on the next patch because this patch is already too large. It should not be doing anything wrong but it will punt on some cases that it is basically capable of doing. The mechanism is there for parameters to be passed on the stack but I have not enabled it because it serves as a way for now to prevent some of the strange cases of O32 register passing that I have not fully checked yet and have some issues. The Mips O32 abi rules are very complicated as far as how data is passed in floating and integer registers. However there is a way to think about this all very simply and this implementation reflects that. Basically, the ABI rules are written as if everything is passed on the stack and aligned as such. Once that is conceptually done, it is nearly trivial to reassign those locations to registers and then all the complexity disappears. So I have told tablegen that all the data is passed on the stack and during the lowering I fix this by assigning to registers as per the ABI doc. This has been my approach and you can line up what I did with the ABI document and see 1 to 1 what is going on. Test Plan: callabi.ll Reviewers: dsanders Reviewed By: dsanders Subscribers: jholewinski, echristo, ahatanak, llvm-commits, rfuhler Differential Revision: http://reviews.llvm.org/D5714 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221948 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-13 23:37:45 +00:00
Matt Arsenault	6f485c0bc5	R600/SI: Fix fmin_legacy / fmax_legacy matching for SI select_cc is expanded on SI, so this was never matched. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221941 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-13 23:03:09 +00:00
Aditya Nandakumar	365df40768	We can get the TLOF from the TargetMachine - so constructor no longer requires TargetLoweringObjectFile to be passed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221926 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-13 21:29:21 +00:00
Juergen Ributzka	add7c56be5	[FastISel][AArch64] Don't bail during simple GEP instruction selection. The generic FastISel code would bail, because it can't emit a sign-extend for AArch64. This copies the code over and uses AArch64 specific emit functions. This is not ideal and 'computeAddress' should handle this, so it can fold the address computation into the memory operation. I plan to clean up 'computeAddress' anyways, so I will add that in a future commit. Related to rdar://problem/18962471. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221923 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-13 20:50:44 +00:00
Matt Arsenault	01ab7a869d	R600/SI: Use s_movk_i32 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221922 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-13 20:44:23 +00:00
Matt Arsenault	0fed4c6a45	R600/SI: Fix definition for s_cselect_b32 These were directly using the old base instruction class, and specifying the wrong register classes for operands. The operands can be the other special inputs besides SGPRs. The op name was also being directly used for the asm string, so this was printed without any operands. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221921 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-13 20:23:36 +00:00
Matt Arsenault	60c3acb36c	R600: Fix assert on empty function If a function is just an unreachable, this would hit a "this is not a MachO target" assertion because of setting HasSubsectionViaSymbols. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221920 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-13 20:07:40 +00:00
Matt Arsenault	8082990487	R600: Error on initializer for LDS. Also give a proper error for other address spaces. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221917 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-13 19:56:13 +00:00
Matt Arsenault	b44e43623d	R600/SI: Get rid of FCLAMP_SI pseudo It's not necessary. Also use complex patterns to allow src modifier usage. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221916 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-13 19:49:04 +00:00
Matt Arsenault	e59f9f46f7	R600/SI: Allow commuting with src2_modifiers git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221911 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-13 19:26:50 +00:00
Matt Arsenault	1aae959de7	R600/SI: Allow commuting some 3 op instructions e.g. v_mad_f32 a, b, c -> v_mad_f32 b, a, c This simplifies matching v_madmk_f32. This looks somewhat surprising, but it appears to be OK to do this. We can commute src0 and src1 in all of these instructions, and that's all that appears to matter. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221910 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-13 19:26:47 +00:00
Tim Northover	8bca5de6a9	ARM: allow constpool entry to be moved to the user's block in all cases. Normally entries can only move to a lower address, but when that wasn't viable, the user's block was considered anyway. Unfortunately, it went via createNewWater which wasn't designed to handle the case where there's already an island after the block. Unfortunately, the test we have is slow and fragile, and I couldn't reduce it to anything sane even with the @llvm.arm.space intrinsic. The test change here is recreating the previous one after the change. rdar://problem/18545506 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221905 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-13 17:58:53 +00:00
Tim Northover	064da63fcb	ARM: avoid duplicating branches during constant islands. We were using a naive heuristic to determine whether a basic block already had an unconditional branch at the end. This mostly corresponded to reality (assuming branches got optimised) because there's not much point in a branch to the next block, but could go wrong. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221904 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-13 17:58:51 +00:00
Tim Northover	5bd311bf17	ARM: add @llvm.arm.space intrinsic for testing ConstantIslands. Creating tests for the ConstantIslands pass is very difficult, since it depends on precise layout details. Having the ability to precisely inject a number of bytes into the stream helps greatly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221903 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-13 17:58:48 +00:00
Colin LeMahieu	511fa66c69	[Hexagon] NFC Renaming reserved identifier. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221898 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-13 16:36:30 +00:00
Elena Demikhovsky	18e1185ddf	AVX-512: SINT_TO_FP cost model and some bugfixes Checked some corner cases, for example translation of <8 x i1> to <8 x double> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221883 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-13 11:46:16 +00:00
Aditya Nandakumar	847729d19a	This patch changes the ownership of TLOF from TargetLoweringBase to TargetMachine so that different subtargets could share the TLOF effectively git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221878 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-13 09:26:31 +00:00
Chandler Carruth	4ea3097d08	[x86] Teach the vector shuffle lowering to make a more nuanced decision between splitting a vector into 128-bit lanes and recombining them vs. decomposing things into single-input shuffles and a final blend. This handles a large number of cases in AVX1 where the cross-lane shuffles would be much more expensive to represent even though we end up with a fast blend at the root. Instead, we can do a better job of shuffling in a single lane and then inserting it into the other lanes. This fixes the remaining bits of Halide's regression captured in PR21281 for AVX1. However, the bug persists in AVX2 because I've made this change reasonably conservative. The cases where it makes sense in AVX2 to split into 128-bit lanes are much more rare because we can often do full permutations across all elements of the 256-bit vector. However, the particular test case in PR21281 is an example of one of the rare cases where it is always better to work in a single 128-bit lane. I'm going to try to teach the logic to detect and form the good code even in AVX2 next, but it will need to use a separate heuristic. Finally, there is one pesky regression here where we previously would craftily use vpermilps in AVX1 to shuffle both high and low halves at the same time. We no longer pull that off, and not for any really good reason. Ultimately, I think this is just another missing nuance to the selection heuristic that I'll try to add in afterward, but this change already seems strictly worth doing considering the magnitude of the improvements in common matrix math shuffle patterns. As always, please let me know if this causes a surprising regression for you. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221861 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-13 04:06:10 +00:00
Chandler Carruth	927a5f45e0	[x86] Don't form overly fragmented blends when splitting and re-combining shuffles because nothing was available in the wider vector type. The key observation (which I've put in the comments for future maintainers) is that at this point, no further combining is really possible. And so even though these shuffles trivially could be combined, we need to actually do that as we produce them when producing them this late in the lowering. This fixes another (huge) part of the Halide vector shuffle regressions. As it happens, this was already well covered by the tests, but I hadn't noticed how bad some of these got. The specific patterns that turn directly into unpckl/h patterns were occurring many times in common vector processing code. There are still more problems here sadly, but I'm trying to incrementally tease them apart, and it looks like this is the core of the problem in the splitting logic. There is some chance of regression here, you can see it in the test changes. Specifically, where we stop forming pshufb in some cases, it is possible that pshufb was in fact faster. Intel "says" that pshufb is slower than the instruction sequences replacing it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221852 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-13 02:42:08 +00:00
Juergen Ributzka	9bb95ddae4	[FastISel][AArch64] Optimize select when one of the operands is a 'true' or 'false' value. Optimize selects of i1 in the presence of 'true' and 'false' operands to simple logic operations. This fixes rdar://problem/18960150. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221848 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-13 00:36:46 +00:00
Juergen Ributzka	b80d6be6d7	[FastISel][AArch64] Fold the cmp into the select when possible. This folds the compare emission into the select emission when possible, so we can directly use the flags and don't have to emit a separate compare. Related to rdar://problem/18960150. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221847 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-13 00:36:43 +00:00
Juergen Ributzka	8d6824ea4c	[FastISel][AArch64] Extend 'select' lowering to support also i1 to i16. Related to rdar://problem/18960150. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221846 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-13 00:36:38 +00:00
Sanjay Patel	dab91bcc3a	Expose the number of Newton-Raphson iterations applied to the hardware's reciprocal estimate as a parameter (x86). This is a follow-on to r221706 and r221731 and discussed in more detail in PR21385. This patch also loosens the testcase checking for btver2. We know that the "1.0" will be loaded, but we can't tell exactly when, so replace the CHECK-NEXT specifiers with plain CHECKs. The CHECK-NEXT sequence relied on a quirk of post-RA-scheduling that may change independently of anything in these tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221819 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-12 21:39:01 +00:00
Ahmed Bougacha	f9e1e56ea1	Add fortified (__*_chk) library functions to TLI (NFC) One of them (__memcpy_chk) was already there, the others were checked by comparing function names. Note that the fortified libfuncs are now part of TLI, but are always available, because they aren't generated, only optimized into the non-checking versions. Differential Revision: http://reviews.llvm.org/D6179 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221817 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-12 21:23:34 +00:00
Cameron McInally	be30336912	[AVX512] Add integer shift by immediate intrinsics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221811 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-12 19:58:54 +00:00
Jingyue Wu	dc9e73b4f8	Fix broken doxygen annotations, NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221801 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-12 18:25:06 +00:00
Jingyue Wu	83c8e732dd	Disable indvar widening if arithmetics on the wider type are more expensive Summary: Reapply r221772. The old patch breaks the bot because the @indvar_32_bit test was run whether NVPTX was enabled or not. IndVarSimplify should not widen an indvar if arithmetics on the wider indvar are more expensive than those on the narrower indvar. For instance, although NVPTX64 treats i64 as a legal type, an ADD on i64 is twice as expensive as that on i32, because the hardware needs to simulate a 64-bit integer using two 32-bit integers. Split from D6188, and based on D6195 which adds NVPTXTargetTransformInfo. Fixes PR21148. Test Plan: Added @indvar_32_bit that verifies we do not widen an indvar if the arithmetics on the wider type are more expensive. This test is run only when NVPTX is enabled. Reviewers: jholewinski, eliben, meheff, atrick Reviewed By: atrick Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D6196 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221799 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-12 18:09:15 +00:00
Justin Hibbits	fcd08c294a	Add support for small-model PIC for PowerPC. Summary: Large-model was added first. With the addition of support for multiple PIC models in LLVM, now add small-model PIC for 32-bit PowerPC, SysV4 ABI. This generates more optimal code, for shared libraries with less than about 16380 data objects. Test Plan: Test cases added or updated Reviewers: joerg, hfinkel Reviewed By: hfinkel Subscribers: jholewinski, mcrosier, emaste, llvm-commits Differential Revision: http://reviews.llvm.org/D5399 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221791 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-12 15:16:30 +00:00
Zoran Jovanovic	cb5fadfe6a	[mips][micromips] Add predicate 'InMicroMips' at CodeGen patterns for microMIPS instructions Differential Revision: http://reviews.llvm.org/D6198 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221780 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-12 13:30:10 +00:00
Chandler Carruth	556578ec0c	[x86] Start improving the matching of unpck instructions based on test cases from Halide folks. This initial step was extracted from a prototype change by Clay Wood to try and address regressions found with Halide and the new vector shuffle lowering. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221779 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-12 10:05:18 +00:00
Elena Demikhovsky	5f9c438577	AVX-512: Intrinsics for ERI 3 instructions: vrcp28, vrsqrt28, vexp2, only vector forms. Intrinsics include SAE (Suppress All Exceptions) parameter. http://reviews.llvm.org/D6214 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221774 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-12 07:31:03 +00:00
Jingyue Wu	ee709fe497	Reverts r221772 which fails tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221773 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-12 07:19:25 +00:00
Jingyue Wu	69adc159ee	Disable indvar widening if arithmetics on the wider type are more expensive Summary: IndVarSimplify should not widen an indvar if arithmetics on the wider indvar are more expensive than those on the narrower indvar. For instance, although NVPTX64 treats i64 as a legal type, an ADD on i64 is twice as expensive as that on i32, because the hardware needs to simulate a 64-bit integer using two 32-bit integers. Split from D6188, and based on D6195 which adds NVPTXTargetTransformInfo. Fixes PR21148. Test Plan: Added @indvar_32_bit that verifies we do not widen an indvar if the arithmetics on the wider type are more expensive. Reviewers: jholewinski, eliben, meheff, atrick Reviewed By: atrick Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D6196 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221772 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-12 06:58:45 +00:00
Bill Schmidt	fc22bfd921	[PowerPC] Add vec_vsx_ld and vec_vsx_st intrinsics This patch enables the vec_vsx_ld and vec_vsx_st intrinsics for PowerPC, which provide programmer access to the lxvd2x, lxvw4x, stxvd2x, and stxvw4x instructions. New LLVM intrinsics are provided to represent these four instructions in IntrinsicsPowerPC.td. These are patterned after the similar intrinsics for lvx and stvx (Altivec). In PPCInstrVSX.td, these intrinsics are tied to the code gen patterns, with additional patterns to allow plain vanilla loads and stores to still generate these instructions. At -O1 and higher the intrinsics are immediately converted to loads and stores in InstCombineCalls.cpp. This will open up more optimization opportunities while still allowing the correct instructions to be generated. (Similar code exists for aligned Altivec loads and stores.) The new intrinsics are added to the code that checks for consecutive loads and stores in PPCISelLowering.cpp, as well as to PPCTargetLowering::getTgtMemIntrinsic(). There's a new test to verify the correct instructions are generated. The loads and stores tend to be reordered, so the test just counts their number. It runs at -O2, as it's not very effective to test this at -O0, when many unnecessary loads and stores are generated. I ended up having to modify vsx-fma-m.ll. It turns out this test case is slightly unreliable, but I don't know a good way to prevent problems with it. The xvmaddmdp instructions read and write the same register, which is one of the multiplicands. Commutativity allows either to be chosen. If the FMAs are reordered differently than expected by the test, the register assignment can be different as a result. Hopefully this doesn't change often. There is a companion patch for Clang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221767 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-12 04:19:40 +00:00
Rafael Espindola	6a222ec893	Pass an ArrayRef to MCDisassembler::getInstruction. With this patch MCDisassembler::getInstruction takes an ArrayRef<uint8_t> instead of a MemoryObject. Even on X86 there is a maximum size an instruction can have. Given that, it seems way simpler and more efficient to just pass an ArrayRef to the disassembler instead of a MemoryObject and have it do a virtual call every time it wants some extra bytes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221751 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-12 02:04:27 +00:00
Rafael Espindola	dc70865b5b	Remove a bit of dead code. Every "real" object file implements this and ptx doesn't use it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221746 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-12 01:27:22 +00:00
Sanjay Patel	4e9c3e9cee	Initialize new subtarget feature variable for generating reciprocal estimate instructions. This was missed in r221706. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221731 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 23:13:15 +00:00
Juergen Ributzka	4f0d671b97	[FastISel][AArch64] Add support for fabs intrinsic. Lower the llvm.fabs intrinsic to the 'fabs' MI instruction. This fixes rdar://problem/18946552. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221729 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 23:10:44 +00:00
Duncan P. N. Exon Smith	5bf8ade9d0	Revert "IR: MDNode => Value" Instead, we're going to separate metadata from the Value hierarchy. See PR21532. This reverts commit r221375. This reverts commit r221373. This reverts commit r221359. This reverts commit r221167. This reverts commit r221027. This reverts commit r221024. This reverts commit r221023. This reverts commit r220995. This reverts commit r220994. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221711 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 21:30:22 +00:00
Tom Roeder	63dea2c952	Add Forward Control-Flow Integrity. This commit adds a new pass that can inject checks before indirect calls to make sure that these calls target known locations. It supports three types of checks and, at compile time, it can take the name of a custom function to call when an indirect call check fails. The default failure function ignores the error and continues. This pass incidentally moves the function JumpInstrTables::transformType from private to public and makes it static (with a new argument that specifies the table type to use); this is so that the CFI code can transform function types at call sites to determine which jump-instruction table to use for the check at that site. Also, this removes support for jumptables in ARM, pending further performance analysis and discussion. Review: http://reviews.llvm.org/D4167 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221708 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 21:08:02 +00:00
Sanjay Patel	e7c966f067	Use rcpss/rcpps (X86) to speed up reciprocal calcs (PR21385). This is a first step for generating SSE rcp instructions for reciprocal calcs when fast-math allows it. This is very similar to the rsqrt optimization enabled in D5658 ( http://reviews.llvm.org/rL220570 ). For now, be conservative and only enable this for AMD btver2 where performance improves significantly both in terms of latency and throughput. We may never enable this codegen for Intel Core* chips because the divider circuits are just too fast. On SandyBridge, divss can be as fast as 10 cycles versus the 21 cycle critical path for the rcp + mul + sub + mul + add estimate. Follow-on patches may allow configuration of the number of Newton-Raphson refinement steps, add AVX512 support, and enable the optimization for more chips. More background here: http://llvm.org/bugs/show_bug.cgi?id=21385 Differential Revision: http://reviews.llvm.org/D6175 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221706 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 20:51:00 +00:00
Bill Schmidt	10161a0cce	[PowerPC] Replace foul hackery with real calls to __tls_get_addr My original support for the general dynamic and local dynamic TLS models contained some fairly obtuse hacks to generate calls to __tls_get_addr when lowering a TargetGlobalAddress. Rather than generating real calls, special GET_TLS_ADDR nodes were used to wrap the calls and only reveal them at assembly time. I attempted to provide correct parameter and return values by chaining CopyToReg and CopyFromReg nodes onto the GET_TLS_ADDR nodes, but this was also not fully correct. Problems were seen with two back-to-back stores to TLS variables, where the call sequences ended up overlapping with unhappy results. Additionally, since these weren't real calls, the proper register side effects of a call were not recorded, so clobbered values were kept live across the calls. The proper thing to do is to lower these into calls in the first place. This is relatively straightforward; see the changes to PPCTargetLowering::LowerGlobalTLSAddress() in PPCISelLowering.cpp. The changes here are standard call lowering, except that we need to track the fact that these calls will require a relocation. This is done by adding a machine operand flag of MO_TLSLD or MO_TLSGD to the TargetGlobalAddress operand that appears earlier in the sequence. The calls to LowerCallTo() eventually find their way to LowerCall_64SVR4() or LowerCall_32SVR4(), which call FinishCall(), which calls PrepareCall(). In PrepareCall(), we detect the calls to __tls_get_addr and immediately snag the TargetGlobalTLSAddress with the annotated relocation information. This becomes an extra operand on the call following the callee, which is expected for nodes of type tlscall. We change the call opcode to CALL_TLS for this case. Back in FinishCall(), we change it again to CALL_NOP_TLS for 64-bit only, since we require a TOC-restore nop following the call for the 64-bit ABIs. 
During selection, patterns in PPCInstrInfo.td and PPCInstr64Bit.td convert the CALL_TLS nodes into BL_TLS nodes, and convert the CALL_NOP_TLS nodes into BL8_NOP_TLS nodes. This replaces the code removed from PPCAsmPrinter.cpp, as the BL_TLS or BL8_NOP_TLS nodes can now be emitted normally using their patterns and the associated printTLSCall print method. Finally, as a result of these changes, all references to get-tls-addr in its various guises are no longer used, so they have been removed. There are existing TLS tests to verify the changes haven't messed anything up. I've added one new test that verifies that the problem with the original code has been fixed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221703 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 20:44:09 +00:00
Rafael Espindola	71c70733b7	Use an 8-bit immediate when possible. This fixes pr21529. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221700 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 19:46:36 +00:00
Dario Domizioli	949d328bee	[X86][ELF] Fix PR20243 - leaf frame pointer bug with TLS access The ISel lowering for global TLS access in PIC mode was creating a pseudo instruction that is later expanded to a call, but the code was not setting the hasCalls flag in the MachineFrameInfo alongside the adjustsStack flag. This caused some functions to be mistakenly recognized as leaf functions, and this in turn affected the decision to eliminate the frame pointer. With the fix, hasCalls is properly set and the leaf frame pointer is correctly preserved. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221695 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 18:44:49 +00:00
Vasileios Kalintiris	328bc2f89e	[mips] Add preliminary support for the MIPS II target. Summary: This patch enables code generation for the MIPS II target. Pre-Mips32 targets don't have the MUL instruction, so we add the correspondent pattern that uses the MULT/MFLO combination in order to retrieve the product. This is WIP as we don't support code generation for select nodes due to the lack of conditional-move instructions. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6150 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221686 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 11:43:55 +00:00
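The MULT/MFLO combination mentioned in the MIPS II entry above can be modelled roughly as follows. This is an illustrative sketch, not code from the patch: the function names are hypothetical, and it only simulates the architectural behaviour (MULT writes the signed 64-bit product to the HI/LO pair, MFLO reads LO back).

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement signed value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mult(rs, rt):
    """Model MULT: signed 32x32 -> 64-bit product, returned as (HI, LO)."""
    product = (to_signed32(rs) * to_signed32(rt)) & MASK64
    return (product >> 32) & MASK32, product & MASK32


def mul_via_mult_mflo(rs, rt):
    """Model the MULT/MFLO pair: keep only LO, i.e. the low 32 bits."""
    _hi, lo = mult(rs, rt)
    return lo
```

The low 32 bits of the product are the same whether the inputs are treated as signed or unsigned, which is why reading LO alone is enough to stand in for a 32-bit MUL.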
Vasileios Kalintiris	b001cb6423	[mips] Add hardware register name "hwr_ulr" ($29) The canonical name when printing assembly is still $29. The reason is that GAS does not accept "$hwr_ulr" at the moment. This addresses the comments from r221307, which reverted the original commit r221299. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221685 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 11:22:39 +00:00
Andrea Di Biagio	d6548ad013	[X86] Add missing check for 'isINSERTPSMask' in method 'isShuffleMaskLegal'. This helps the DAGCombiner to identify more opportunities to fold shuffles. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221684 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 11:20:31 +00:00
Vasileios Kalintiris	d3da72c5b3	Recommit "[mips] Add names and tests for the hardware registers" The original commit r221299 was reverted in r221307. I removed the name "hwr_ulr" ($29) from the original commit because two tests were failing. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221681 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 10:31:31 +00:00
Craig Topper	20d2a260c9	Use uint64_t as the type for the X86 TSFlag format enum. Allows removal of the VEXShift hack that was used to access the higher bits of TSFlags. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221673 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 07:32:32 +00:00
Michael Kuperstein	f2fe3b72a9	[X86] Fix pattern match for 32-to-64-bit zext in the presence of AssertSext This fixes an issue with matching trunc -> assertsext -> zext on x86-64, which would not zero the high 32-bits. See PR20494 for details. Recommitting - This time, with a hopefully working test. Differential Revision: http://reviews.llvm.org/D6128 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221672 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 07:07:40 +00:00
Jingyue Wu	7b901e3907	[NVPTX] Remove dead code in NVPTXTargetTransformInfo (NFC) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221668 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 05:24:04 +00:00
Rafael Espindola	9272305648	MCAsmParserExtension has a copy of the MCAsmParser. Use it. Base classes were storing a second copy. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221667 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 05:18:41 +00:00
Quentin Colombet	8201185d61	[X86] Custom lower UINT_TO_FP from v4i32 to v4f32, and from v8i32 to v8f32 if AVX2 is available. According to IACA, the new lowering has a throughput of 8 cycles instead of 13 with the previous one. Although this lowering kicks in on some SPEC benchmarks, the performance improvement was within the noise. Correctness testing has been done for the whole range of uint32_t with the following program: uint4 v = (uint4) {0,1,2,3}; uint32_t i; //Check correctness over entire range for uint4 -> float4 conversion for( i = 0; i < 1U << (32-2); i++ ) { float4 t = test(v); float4 c = correct(v); if( 0xf != _mm_movemask_ps( t == c )) { printf( "Error @ %vx: %vf vs. %vf\n", v, c, t); return -1; } v += 4; } Where "correct" is the old lowering and "test" the new one. The patch adds a test case for the two custom lowerings. It also modifies the vector cost model, which is why cast.ll and uitofp.ll are modified. 2009-02-26-MachineLICMBug.ll is also modified because we now hoist 7 instructions instead of 4 (3 more constant loads). <rdar://problem/18153096> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221657 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 02:23:47 +00:00
Michael Kuperstein	dee48e7ad4	Reverting r221626 due to a too-strict test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221629 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-10 21:07:41 +00:00
Juergen Ributzka	1b9706b8c6	[AArch64][FastISel] Fix kill flags for integer extends. In the case we optimize an integer extend away and replace it directly with the source register, we also have to clear all kill flags at all its uses. This is necessary, because the original IR instruction might be trivially dead, but we replaced it with a nop at MI level. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221628 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-10 21:05:31 +00:00
Michael Kuperstein	1a66dc7468	[X86] Fix pattern match for 32-to-64-bit zext in the presence of AssertSext This fixes an issue with matching trunc -> assertsext -> zext on x86-64, which would not zero the high 32-bits. See PR20494 for details. Differential Revision: http://reviews.llvm.org/D6128 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221626 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-10 20:40:21 +00:00
Jingyue Wu	26e0544c0a	[NVPTX] Add an NVPTX-specific TargetTransformInfo Summary: It currently only implements hasBranchDivergence, and will be extended in later diffs. Split from D6188. Test Plan: make check-all Reviewers: jholewinski Reviewed By: jholewinski Subscribers: llvm-commits, meheff, eliben, jholewinski Differential Revision: http://reviews.llvm.org/D6195 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221619 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-10 18:38:25 +00:00
Rafael Espindola	d342c4c748	Misc style fixes. NFC. This fixes a few cases of: * Wrong variable name style. * Lines longer than 80 columns. * Repeated names in comments. * clang-format of the above. This makes the next patch a lot easier to read. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221615 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-10 18:11:10 +00:00
Zoran Jovanovic	c63c935a80	[mips][microMIPS] Fix issue with delay slot filler and microMIPS Differential Revision: http://reviews.llvm.org/D6193 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221612 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-10 17:27:56 +00:00
Daniel Sanders	62c2faa216	[mips] Fix sret arguments for N32/N64 which were accidentally broken in r221534. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221604 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-10 15:57:53 +00:00
Matt Arsenault	4d84fe9839	R600: Remove unused define git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221543 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-07 20:45:00 +00:00
Daniel Sanders	fe2b8b1960	[mips] Promote i32 arguments to i64 for the N32/N64 ABI and fix <64-bit structs... Summary: ... and after all that refactoring, it's possible to distinguish softfloat floating point values from integers so this patch no longer breaks softfloat to do it. Remove direct handling of i32's in the N32/N64 ABI by promoting them to i64. This more closely reflects the ABI documentation and also fixes problems with stack arguments on big-endian targets. We now rely on signext/zeroext annotations (already generated by clang) and the Assert[SZ]ext nodes to avoid the introduction of unnecessary sign/zero extends. It was not possible to convert three tests to use signext/zeroext. These tests are bswap.ll, ctlz-v.ll, ctlz-v.ll. It's not possible to put signext on a vector type so we just accept the sign extends here for now. These tests don't pass the vectors the same way clang does (clang puts multiple elements in the same argument, these map 1 element to 1 argument) so we don't need to worry too much about it. With this patch, all known N32/N64 bugs should be fixed and we now pass the first 10,000 tests generated by ABITest.py. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6117 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221534 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-07 16:54:21 +00:00
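As a rough illustration of what the signext/zeroext annotations in the entry above promise about a promoted i32 argument, the sketch below shows the two ways a 32-bit value can be widened to 64 bits. It is not taken from the patch; the helper names are hypothetical and the code only models the resulting register contents.

```python
def sign_extend_32_to_64(value):
    """Widen a 32-bit value to 64 bits by copying bit 31 into bits 32..63."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        value |= 0xFFFFFFFF00000000
    return value


def zero_extend_32_to_64(value):
    """Widen a 32-bit value to 64 bits by clearing bits 32..63."""
    return value & 0xFFFFFFFF
```

For example, the 32-bit pattern 0xFFFFFFFF becomes 0xFFFFFFFFFFFFFFFF when sign-extended and 0x00000000FFFFFFFF when zero-extended, so caller and callee have to agree on which form is used.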
Daniel Sanders	01657f390d	[mips] Removed the remainder of MipsCC. NFC. Summary: One of the calls to AllocateStack (the one in LowerCall) doesn't look like it should be there but it was there before and removing it breaks the frame size calculation. Reviewers: vmedic, theraven Reviewed By: theraven Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6116 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221529 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-07 15:33:08 +00:00
Daniel Sanders	46c098238e	[mips] Remove MipsCC::reservedArgArea() in favour of MipsABIInfo::GetCalleeAllocdArgSizeInBytes(). NFC. Summary: Reviewers: theraven, vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6115 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221528 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-07 15:03:53 +00:00
NAKAMURA Takumi	490119e467	MipsCCState.h: Use LLVM_DELETED_FUNCTION for msc17. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221527 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-07 14:56:31 +00:00
Daniel Sanders	6b846945ec	[mips] Move MipsCCState to a separate file and clang-formatted it. Summary: Depends on D6113 Reviewers: theraven, vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6114 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221525 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-07 14:24:31 +00:00
Daniel Sanders	7fc57f4cef	[mips] Fix unused variable warnings introduced in r221521 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221522 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-07 12:43:01 +00:00
Daniel Sanders	605a80b218	[mips] Remove remaining use of MipsCC::intArgRegs() in favour of MipsABIInfo::GetByValArgRegs() and MipsABIInfo::GetVarArgRegs() Summary: Depends on D6112 Reviewers: theraven, vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6113 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221521 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-07 12:21:37 +00:00
Daniel Sanders	73bbfc537e	[mips] Remove MipsCC::getRegVT(). NFC Summary: It's no longer used. Reviewers: vmedic, theraven Reviewed By: theraven Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6112 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221519 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-07 12:02:59 +00:00
Daniel Sanders	b4b823941c	[mips] Remove MipsCC::analyzeCallOperands in favour of CCState::AnalyzeCallOperands. NFC Summary: In addition to the usual f128 workaround, it was also necessary to provide a means of accessing ArgListEntry::IsFixed. Reviewers: theraven, vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6111 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221518 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-07 11:43:49 +00:00
Daniel Sanders	e40a5317d5	[mips] Move SpecialCallingConv to MipsCCState and use it from tablegen-erated code. NFC Summary: In the long run, it should probably become a calling convention in its own right but for now just move it out of MipsISelLowering::analyzeCallOperands() so that we can drop this function in favour of CCState::AnalyzeCallOperands(). Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6085 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221517 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-07 11:10:48 +00:00
Daniel Sanders	37dbb7411c	[mips] Removed IsVarArg from MipsISelLowering::analyzeCallOperands(). NFC. Summary: CCState objects already carry this information in their isVarArg() method. Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6084 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221516 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-07 10:45:16 +00:00
Ahmed Bougacha	75da31ff15	[AArch64] Keep flags on condition vreg when instantiating a CB branch. Reversing a CB* instruction used to drop the flags on the condition. On the included testcase, this led to a read from an undefined vreg. Using addOperand keeps the flags, here <undef>. Differential Revision: http://reviews.llvm.org/D6159 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221507 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-07 02:50:00 +00:00
Simon Pilgrim	de3d50643c	[X86][SSE] Vector integer/float conversion memory folding (cvttps2dq / cvttpd2dq) Fixed an issue with the (v)cvttps2dq and (v)cvttpd2dq instructions being incorrectly put in the 2 source operand folding tables instead of the 1 source operand and added the missing SSE/AVX versions. Also added missing (v)cvtps2dq and (v)cvtpd2dq instructions to the folding tables. Differential Revision: http://reviews.llvm.org/D6001 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221489 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 22:15:41 +00:00
Ahmed Bougacha	112aabeeeb	[X86] Add VFMADDSUB cases for the 213->231 custom inserter. Also add tests for vfmadd/vfmsub. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221488 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 22:04:15 +00:00
Ahmed Bougacha	f44d4cd925	[X86] Add missing FMA3 VFMADDSUB in the emitter. Also reuse the fma4 intrinsic test to cover fma3 instructions too. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221487 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 21:58:11 +00:00
Colin LeMahieu	d67fc42d22	[Hexagon] Adding basic Hexagon ELF object emitter. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221465 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 17:05:51 +00:00
Eli Bendersky	9b644c8749	Clean up NVPTXLowerStructArgs.cpp. NFC * Remove unnecessary const_casts and C-style casts * Simplify attribute access code * Simplify ArrayRef creation * 80-col and clang-format git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221464 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 17:05:49 +00:00
Daniel Sanders	0d9e067f5a	[mips] Removed IsSoftFloat from MipsISelLowering::analyzeCallOperands(). NFC Summary: It isn't used anymore. Depends on D6081 Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6083 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221463 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 16:48:57 +00:00
Daniel Sanders	20e3be7972	[mips] Removed MipsISelLowering::analyzeFormalArguments() in favour of CCState::AnalyzeFormalArguments() Summary: As with returns, we must be able to identify f128 arguments despite them being lowered away. We do this with a pre-analyze step that builds a vector and then we use this vector from the tablegen-erated code. Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6081 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221461 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 16:36:30 +00:00
Andrea Di Biagio	f0f66a254d	[X86] When commuting SSE immediate blend, make sure that the new blend mask is a valid imm8. Example: define <4 x i32> @test(<4 x i32> %a, <4 x i32> %b) { %shuffle = shufflevector <4 x i32> %a, <4 x i32> %b, <4 x i32> <i32 4, i32 5, i32 6, i32 3> ret <4 x i32> %shuffle } Before llc (-mattr=+sse4.1), produced the following assembly instruction: pblendw $4294967103, %xmm1, %xmm0 After pblendw $63, %xmm1, %xmm0 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221455 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 14:36:45 +00:00
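The two immediates quoted in the blend entry above are consistent with a mask that was inverted without being truncated to 8 bits. The sketch below reproduces the arithmetic under that assumption; the pre-commute mask 0xC0 is inferred from the printed values rather than stated in the commit, and the function names are hypothetical.

```python
def commute_blend_mask_unmasked(mask):
    """Invert the mask as a 32-bit unsigned value without truncating it.

    Assumed model of the old behaviour: the result no longer fits in an imm8.
    """
    return ~mask & 0xFFFFFFFF


def commute_blend_mask(mask, num_lanes=8):
    """Invert the mask and keep only one bit per lane (8 lanes for pblendw)."""
    return ~mask & ((1 << num_lanes) - 1)
```

With a starting mask of 0xC0, the untruncated inversion gives 4294967103 and the truncated one gives 63, matching the "before" and "after" operands of pblendw shown above.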
Aaron Ballman	22cfcb2469	Fixing some -Wcast-qual warnings; NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221454 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 14:32:30 +00:00
Toma Tabacu	7f22a20351	[mips] Tolerate the use of the %z inline asm operand modifier with non-immediates. Summary: Currently, we give an error if %z is used with non-immediates, instead of continuing as if the %z isn't there. For example, you use the %z operand modifier along with the "Jr" constraints ("r" makes the operand a register, and "J" makes it an immediate, but only if its value is 0). In this case, you want the compiler to print "$0" if the inline asm input operand turns out to be an immediate zero and you want it to print the register containing the operand, if it's not. We give an error in the latter case, and we shouldn't (GCC also doesn't). Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6023 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221453 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 14:25:42 +00:00
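The behaviour described for the %z modifier in the entry above can be summarised with a small model. This is a hypothetical sketch of the printing rule only, not the AsmPrinter code; the operand representation (a kind string plus a value) is invented for illustration.

```python
def print_z_operand(kind, value):
    """Model how an operand written with %z is printed.

    kind  -- "imm" for an immediate, "reg" for a register (hypothetical encoding)
    value -- an int for immediates, a register name such as "$4" for registers
    """
    if kind == "imm" and value == 0:
        # An immediate zero is printed as the hard-wired zero register.
        return "$0"
    if kind == "imm":
        # Any other immediate is printed as if %z were not there.
        return str(value)
    # Registers are tolerated and printed unchanged instead of raising an error.
    return value
```

With the "Jr" constraint, a constant zero input would reach the first branch and print "$0", while any other input would arrive as a register and be printed as-is.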
Sasa Stankovic	2b8f96996b	[mips] Add the following MIPS options that control gp-relative addressing of small data items: -mgpopt, -mlocal-sdata, -mextern-sdata. Implement gp-relative addressing for constants. Differential Revision: http://reviews.llvm.org/D4903 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221450 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 13:20:12 +00:00
Toma Tabacu	ea60f51d87	[mips] Improve error/warning messages and testing for the .cpload assembler directive. Summary: Improved warning message when using .cpload inside a reorder section and added an error message for using .cpload with Mips16 enabled. Modified the tests to fit with the changes mentioned above, added a test-case for the N32 ABI in cpload.s and did some reformatting to make the tests easier to read. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5465 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221447 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 10:02:45 +00:00
David Majnemer	a67e1693fc	X86, MC: Tidy up some whitespace in GetRelocType No functionality change intended. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221443 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 08:10:37 +00:00
Quentin Colombet	d465eced34	[X86] Lower VSELECT into SHRUNKBLEND when we shrink the bits used into the condition to match a blend. This prevents optimizations that work on VSELECT from performing invalid transformations. Indeed, the optimized condition does not match the vector boolean content that is expected and bad things may happen. This patch yields the exact same code on the whole test-suite + specs (-O3 and -O3 -march=core-avx2), it improves one test case (vector-blend.ll) and fixes a bug reduced in vselect-avx.ll. <rdar://problem/18819506> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221429 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-06 02:25:03 +00:00
Simon Pilgrim	3f1d66fe93	[X86][SSE] Vector integer to float conversion memory folding Added missing memory folding for the (V)CVTDQ2PS instructions - we can safely fold these (but not the (V)CVTDQ2PD versions which have a register/memory size discrepancy in the source operand). I've added a test case demonstrating that stack folding now works. Differential Revision: http://reviews.llvm.org/D5981 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221407 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 22:28:25 +00:00
Matt Arsenault	ecb144e2d1	R600/SI: Fix omod display for VOP3b git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221387 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 19:35:00 +00:00
Derek Schuff	07ffcb1007	[x86 fast-isel] Materialize allocas with the correct-sized lea for ILP32 Summary: X86FastISel::fastMaterializeAlloca was incorrectly conditioning its opcode selection on subtarget bitness rather than pointer size. Differential Revision: http://reviews.llvm.org/D6136 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221386 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 19:27:21 +00:00
Matt Arsenault	d8586375b8	R600/SI: Move all rsrc building functions to SIISelLowering git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221383 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 19:01:19 +00:00
Matt Arsenault	12bd9f11c0	R600/SI: Remove SI_ADDR64_RSRC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221382 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 19:01:17 +00:00
Justin Holewinski	e459c0bf65	[NVPTX] Add NVPTXLowerStructArgs pass This works around the limitation that PTX does not allow .param space loads/stores with arbitrary pointers. If a function has a by-val struct ptr arg, say foo(%struct.x byval %d), then add the following instructions to the first basic block : %temp = alloca %struct.x, align 8 %tt1 = bitcast %struct.x %d to i8 * %tt2 = llvm.nvvm.cvt.gen.to.param %tt2 %tempd = bitcast i8 addrspace(101) * to %struct.x addrspace(101) * %tv = load %struct.x addrspace(101) * %tempd store %struct.x %tv, %struct.x * %temp, align 8 The above code allocates some space in the stack and copies the incoming struct from param space to local space. Then replace all occurrences of %d by %temp. Fixes PR21465. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221377 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 18:19:30 +00:00
Duncan P. N. Exon Smith	bad06b13ba	IR: MDNode => Value: NamedMDNode::getOperator() Change `NamedMDNode::getOperator()` from returning `MDNode *` to returning `Value *`. To reduce boilerplate at some call sites, add a `getOperatorAsMDNode()` for named metadata that's expected to only return `MDNode` -- for now, that's everything, but debug node named metadata (such as llvm.dbg.cu and llvm.dbg.sp) will soon change. This is part of PR21433. Note that there's a follow-up patch to clang for the API change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221375 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 18:16:03 +00:00
Tilmann Scheller	463e275e3e	[ARM] Remove more dead code. Dead code identified by the Clang static analyzer. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221372 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 17:45:04 +00:00
Zoran Jovanovic	cd2d40cef6	[mips][microMIPS] Implement CodeGen support for ANDI16 instruction git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221371 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 17:43:00 +00:00
Colin LeMahieu	455618c920	[Hexagon] [NFC] Alphabetizing cmake files. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221370 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 17:38:48 +00:00
Zoran Jovanovic	a1925e6d5d	[mips][microMIPS] Implement CodeGen support for SLL16 and SRL16 instructions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221369 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 17:38:31 +00:00
Tilmann Scheller	4929a03d42	[ARM] Remove another redundant assignment. Found by the Clang static analyzer. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221368 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 17:34:04 +00:00
Zoran Jovanovic	8dad1e1e8e	[mips][microMIPS] Implement ANDI16 instruction git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221367 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 17:31:00 +00:00

31568 Commits