llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-12-15 20:29:48 +00:00

Author	SHA1	Message	Date
Jim Grosbach	f4e104f5eb	AArch64: Constant fold converting vector setcc results to float. Since the result of a SETCC for AArch64 is 0 or -1 in each lane, we can move unary operations, in this case [su]int_to_fp through the mask operation and constant fold the operation away. Generally speaking: UNARYOP(AND(VECTOR_CMP(x,y), constant)) --> AND(VECTOR_CMP(x,y), constant2) where constant2 is UNARYOP(constant). This implements the transform where UNARYOP is [su]int_to_fp. For example, consider the simple function: define <4 x float> @foo(<4 x float> %val, <4 x float> %test) nounwind { %cmp = fcmp oeq <4 x float> %val, %test %ext = zext <4 x i1> %cmp to <4 x i32> %result = sitofp <4 x i32> %ext to <4 x float> ret <4 x float> %result } Before this change, the code is generated as: fcmeq.4s v0, v0, v1 movi.4s v1, #0x1 // Integer splat value. and.16b v0, v0, v1 // Mask lanes based on the comparison. scvtf.4s v0, v0 // Convert each lane to f32. ret After, the code is improved to: fcmeq.4s v0, v0, v1 fmov.4s v1, #1.00000000 // f32 splat value. and.16b v0, v0, v1 // Mask lanes based on the comparison. ret The svvtf.4s has been constant folded away and the floating point 1.0f vector lanes are materialized directly via fmov.4s. Rather than do the folding manually in the target code, teach getNode() in the generic SelectionDAG to handle folding constant operands of vector [su]int_to_fp nodes. It is reasonable (as noted in a FIXME) to do additional constant folding there as well, but I don't have test cases for those operations, so leaving them for another time when it becomes appropriate. rdar://17693791 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213341 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-18 00:40:52 +00:00
Eric Christopher	f0b9d0b3ca	Reset the Subtarget in the AsmPrinter for each machine function and add explanatory comment about dual initialization. Fix use of the Subtarget to grab the information off of the target machine. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213336 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-18 00:08:53 +00:00
Eric Christopher	a1fa640525	Avoid resetting the UseSoftFloat and FloatABIType on the TargetMachine Options struct and move the comment to inMips16HardFloat. Use the fact that we now know whether or not we cared about soft float to set the libcalls. Accordingly rename mipsSEUsesSoftFloat to abiUsesSoftFloat and propagate since it's no longer CPU specific. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213335 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-18 00:08:50 +00:00
Nico Weber	c1ef24ce39	ms inline asm: Don't add x86 segment registers to the clobber list. Clang tries to check the clobber list but doesn't list segment registers in its x86 register list. This fixes PR20343. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213303 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 20:24:55 +00:00
Arnaud A. de Grandmaison	08f689e9b0	[AArch64] Cleanup AsmParser: no need to use dyn_cast + assert. cast does it for us. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213296 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 19:08:14 +00:00
Justin Holewinski	11ae250ec9	[NVPTX] Improve handling of FP fusion We now consider the FPOpFusion flag when determining whether to fuse ops. We also explicitly emit add.rn when fusion is disabled to prevent ptxas from fusing the operations on its own. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213287 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 18:10:09 +00:00
Matt Arsenault	15865afcf7	Fix typos git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213285 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 17:50:22 +00:00
Adam Nemet	6ae2941874	[X86] AVX512: Add disassembler support for compressed displacement There are two parts here. First is to modify tablegen to adjust the encoding type ENCODING_RM with the scaling factor. The second is to use the new encoding types to compute the correct displacement in the decoder. Fixes <rdar://problem/17608489> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213281 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 17:04:56 +00:00
Adam Nemet	f166cc13e2	[X86] AVX512: Rename EVEX_CD8V to CD8_Form This is to match the naming of CD8_EltSize, CD8_Scale, etc. No functional change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213280 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 17:04:52 +00:00
Adam Nemet	5c6e7de1ad	[X86] AVX512: Use the TD version of CD8_Scale in the assembler Passes the computed scaling factor in TSFlags rather than the old attributes. Also removes the C++ version of computing the scaling factor (MemObjSize) along with the asserts added by the previous patch. No functional change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213279 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 17:04:50 +00:00
Adam Nemet	ccba8025da	[X86] AVX512: Move compressed displacement logic to TD This does not actually move the logic yet but reimplements it in the Tablegen language. Then asserts that the new implementation results in the same value. The next patch will remove the assert and the temporary use of the TSFlags and remove the C++ implementation. The formula requires a limited form of the logical left and right operators. I implemented these with the bit-extract/insert operator (i.e. blah{bits}). No functional change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213278 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 17:04:34 +00:00
Justin Holewinski	07bc0b6ae6	[NVPTX] Add missing .v4 qualifier on vector store instruction git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213276 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 16:58:56 +00:00
Saleem Abdulrasool	a056166dc2	MC: fix MCAsmInfo usage for windows-itanium Windows itanium uses the GNUCOFF assmebly format, not ELF. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213274 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 16:27:40 +00:00
Justin Holewinski	b26ede07ad	[NVPTX] Flag surface/texture query instructions with IsTexSurfQuery Also, add some tests to make sure we can handle surface/texture queries on both Fermi and Kepler+. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213268 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 14:51:33 +00:00
Justin Holewinski	d6663f565c	[NVPTX] Add more surface/texture intrinsics, including CUDA unified texture fetch This also uses TSFlags to mark machine instructions that are surface/texture accesses, as well as the vector width for surface operations. This is used to simplify some of the switch statements that need to detect surface/texture instructions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213256 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 11:59:04 +00:00
Tim Northover	58589cefee	ARM: support direct f16 <-> f64 conversions ARMv8 has instructions to handle it, otherwise a libcall is needed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213254 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 11:27:04 +00:00
Tim Northover	6c701b9aca	CodeGen: generate single libcall for fptrunc -> f16 operations. Previously we asserted on this code. Currently compiler-rt doesn't actually implement any of these new libcalls, but external help is pretty much the only viable option for LLVM. I've followed the much more generic "__truncST2" naming, as opposed to the odd name for f32 -> f16 truncation. This can obviously be changed later, or overridden by any targets that need to. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213252 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 11:12:12 +00:00
Tim Northover	ed05086d61	X86: support double extension of f16 type. x86 has no native ability to extend an f16 to f64, but the same result is obtained if we expand it into two separate extensions: f16 -> f32 -> f64. Unfortunately the same is not true for truncate, so that still results in a compilation failure. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213251 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 11:04:04 +00:00
Tim Northover	3e61ccdded	CodeGen: extend f16 conversions to permit types > float. This makes the two intrinsics @llvm.convert.from.f16 and @llvm.convert.to.f16 accept types other than simple "float". This is only strictly needed for the truncate operation, since otherwise double rounding occurs and there's no way to represent the strict IEEE conversion. However, for symmetry we allow larger types in the extend too. During legalization, we can expand an "fp16_to_double" operation into two extends for convenience, but abort when the truncate isn't legal. A new libcall is probably needed here. Even after this commit, various target tweaks are needed to actually use the extended intrinsics. I've put these into separate commits for clarity, so there are no actual tests of f64 conversion here. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213248 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 10:51:23 +00:00
Yi Kong	f33a30cdd0	Port memory barriers intrinsics to AArch64 Memory barrier __builtin_arm_[dmb, dsb, isb] intrinsics are required to implement their corresponding ACLE and MSVC intrinsics. This patch ports ARM dmb, dsb, isb intrinsic to AArch64. Differential Revision: http://reviews.llvm.org/D4520 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213247 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 10:50:20 +00:00
Daniel Sanders	8f69856b52	[mips] .reginfo is 8 byte aligned on N32. Differential Revision: http://reviews.llvm.org/D4540 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213246 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 10:10:04 +00:00
Daniel Sanders	3da6527068	[mips] Correct ELF e_flags for the N32 ABI when using a mips-* triple rather than a mips64-* triple Summary: Generally speaking, mips-* vs mips64-* should not be used to make decisions about the content or format of the ELF. This should be based on the ABI and CPU in use. For example, `mips-linux-gnu-clang -mips64r2 -mabi=64` should produce an ELF64 as should `mips64-linux-gnu-clang -mabi=64`. Conversely, `mips64-linux-gnu-clang -mabi=n32` should produce an ELF32 as should `mips-linux-gnu-clang -mips64r2 -mabi=n32`. This patch fixes the e_flags but leaves the ELF32 vs ELF64 issue for now since there is no apparent way to base this decision on the ABI and CPU. Differential Revision: http://reviews.llvm.org/D4539 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213244 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 10:02:08 +00:00
Daniel Sanders	0a342e9992	[mips] Correct .MIPS.abiflags for -mfpxx on MIPS32r6 Summary: The cpr1_size field describes the minimum register width to run the program rather than the size of the registers on the target. MIPS32r6 was acting as if -mfp64 has been given because it starts off with 64-bit FPU registers. Differential Revision: http://reviews.llvm.org/D4538 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213243 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 09:57:23 +00:00
Daniel Sanders	4b5e7542df	[mips] Fix ELF e_flags related to -mabicalls and -mplt. Summary: These options are not implemented yet but we act as if they are always given. The integrated assembler is driven by the clang driver so the e_flag test cases should match the e_flags emitted by GCC+GAS rather than GAS by itself. Differential Revision: http://reviews.llvm.org/D4536 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213242 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 09:52:56 +00:00
Matt Arsenault	21a9a6658e	Use range for git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213230 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 06:19:06 +00:00
Matt Arsenault	df4663be6a	R600: Short circuit alloca check if address space isn't private. Skip calling GetUnderlyingObject in cases where it obviously isn't from an alloca. This should only be a compile time improvement. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213229 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 06:13:41 +00:00
Sanjay Patel	8758b729f5	Remove Atom references in description. Any CPU can run this pass. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213190 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-16 20:18:49 +00:00
Chris Bieneman	ec7a144603	[RegisterCoalescer] Moving the RegisterCoalescer subtarget hook onto the TargetRegisterInfo instead of the TargetSubtargetInfo. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213188 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-16 20:13:31 +00:00
Justin Holewinski	a1535e3b9b	[NVPTX] Honor alignment on vector loads/stores We were not considering the stated alignment on vector loads/stores, leading us to generate vector instructions even when we do not have sufficient alignment. Now, for IR like: %1 = load <4 x float>, <4 x float>* %ptr, align 4 we will generate correct, conservative PTX like: ld.f32 ... [%ptr] ld.f32 ... [%ptr+4] ld.f32 ... [%ptr+8] ld.f32 ... [%ptr+12] Or if we have an alignment of 8 (for example), we can generate code like: ld.v2.f32 ... [%ptr] ld.v2.f32 ... [%ptr+8] git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213186 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-16 19:45:35 +00:00
Chris Bieneman	4722b28a3e	Added documentation for SizeMultiplier in the ARM subtarget hook for register coalescing. Also fixed some 80 col violations. No functional code changes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213169 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-16 16:27:31 +00:00
Justin Holewinski	7e6565112b	[NVPTX] Rename registers %fl -> %fd and %rl -> %rd This matches the internal behavior of NVIDIA tools like libnvvm. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213168 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-16 16:26:58 +00:00
Tim Northover	1cafa00e26	CodeGen: don't form illegail EXTLOAD operations. It turns out that in most cases (the main exception being i1-related types) once these operations are formed we cannot separate them and the targets end up having to deal with them whether they want to or not. This is not a good situation, and a more reasonable default can be formed by ackowledging this and having targets leave them as Legal. Only x86 seems to be affected (other targets don't even try marking the operation Expand). Mostly there's no visible change here yet, but it will be useful to have truly expanded EXTLOADS for MVT::f16 softening support. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213162 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-16 15:37:24 +00:00
Daniel Sanders	663e484df9	[mips][fp64a] Temporarily disable odd-numbered double-precision registers when using the FP64A ABI. Summary: A few instructions (mostly cvt.d.w and similar) are causing problems with -mfp64 and -mno-odd-spreg and it looks like fixing it properly may take several weeks. In the meantime, let's disable the odd-numbered double-precision registers so that the generated code is at least valid. The problem is that instructions like cvt.d.w read from the 32-bit low subregister of a double-precision FPU register. This often leads to the compiler to inserting moves to transfer a GPR32 to a FGR32 using mtc1. Such moves violate the rules against 32-bit writes to odd-numbered FPU registers imposed by -mno-odd-spreg. By disabling the odd-numbered double-precision registers, it becomes impossible for the 32-bit low subregister to be odd-numbered. This fixes numerous test-suite failures when compiling for the FP64A ABI ('-mfp64 -mno-odd-spreg'). There is no LLVM test case because it's difficult to test that odd-numbered FPU registers are not allocatable. Instead, we depend on the assembler (GAS and -fintegrated-as) raising errors when the rules are violated. Differential Revision: http://reviews.llvm.org/D4532 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213160 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-16 15:34:07 +00:00
Andrea Di Biagio	35f6e97777	[X86] Add a check for 'isMOVHLPSMask' within method 'isShuffleMaskLegal'. Before this change, method 'isShuffleMaskLegal' didn't know that shuffles implementing a 'movhlps' operation were perfectly legal for SSE targets. This patch adds the missing check for 'isMOVHLPSMask' inside method 'isShuffleMaskLegal' to fix the problem. The reason why it is important to do this is because the DAGCombiner conservatively avoids combining a pair of shuffles if the resulting shuffle node has an illegal mask. Before this patch, shuffles with a MOVHLPS mask were wrongly considered not to be legal. This was the root cause of some poor-code generation bugs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213137 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-16 11:29:39 +00:00
Matt Arsenault	a27bf373d9	R600/SI: Allow using f32 rcp / rsq when denormals not handled. These are precise enough to use for OpenCL unless denormals are handled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213107 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-15 23:50:10 +00:00
David Majnemer	6f7532bb97	X86: Simplify X86WindowsTargetObjectFile::getSectionForConstant There exists a helper function to abstract away the various differences between ConstantVector, ConstantDataVector, ConstantAggregateZero, etc. Use it to simplify X86WindowsTargetObjectFile::getSectionForConstant. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213104 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-15 23:01:10 +00:00
Sanjay Patel	f7e042324a	Move Post RA Scheduling flag bit into SchedMachineModel Refactoring; no functional changes intended Removed PostRAScheduler bits from subtargets (X86, ARM). Added PostRAScheduler bit to MCSchedModel class. This bit is set by a CPU's scheduling model (if it exists). Removed enablePostRAScheduler() function from TargetSubtargetInfo and subclasses. Fixed the existing enablePostMachineScheduler() method to use the MCSchedModel (was just returning false!). Added methods to TargetSubtargetInfo to allow overrides for AntiDepBreakMode, CriticalPathRCs, and OptLevel for PostRAScheduling. Added enablePostRAScheduler() function to PostRAScheduler class which queries the subtarget for the above values. Preserved existing scheduler behavior for ARM, MIPS, PPC, and X86: a. ARM overrides the CPU's postRA settings by enabling postRA for any non-Thumb or Thumb2 subtarget. b. MIPS overrides the CPU's postRA settings by enabling postRA for everything. c. PPC overrides the CPU's postRA settings by enabling postRA for everything. d. X86 is the only target that actually has postRA specified via sched model info. Differential Revision: http://reviews.llvm.org/D4217 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213101 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-15 22:39:58 +00:00
Matt Arsenault	7929c13df0	R600/SI: Fix select on i1 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213096 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-15 21:44:37 +00:00
Matt Arsenault	4b0a7f3946	R600/SI: Implement less wrong f32 fdiv Assuming single precision denormals and accurate sqrt/div are not reported, this passes the OpenCL conformance test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213089 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-15 20:18:31 +00:00
Matt Arsenault	e4c6241505	R600: Add predicate for UnsafeFPMath git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213088 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-15 20:18:24 +00:00
Matt Arsenault	fc7ff2ac50	R600: Remove intrinsics that appear to be unused git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213087 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-15 20:10:27 +00:00
Chris Bieneman	01d8611240	[RegisterCoalescer] Add new subtarget hook allowing targets to opt-out of coalescing. The coalescer is very aggressive at propagating constraints on the register classes, and the register allocator doesn’t know how to split sub-registers later to recover. This patch provides an escape valve for targets that encounter this problem to limit coalescing. This patch also implements such for ARM to lower register pressure when using lots of large register classes. This works around PR18825. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213078 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-15 17:18:41 +00:00
Cameron McInally	ef886879b6	Revert r213070. It's breaking the build in MCELFStreamer::EmitInstToData(...). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213073 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-15 16:24:24 +00:00
Jan Vesely	7b5f8a525a	R600: Implement zero undef variants of ctlz/cttz v2: use ffbh/l if available v3: Rebase on top of Matt's SI patches Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom@stellard.net> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213072 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-15 15:51:09 +00:00
Daniel Sanders	6d6a055c60	[mips] Correct .MIPS.abiflags fp_abi field for -mfpxx and without .module Summary: Previously all the test cases set it after initialization with '.module fp=xx'. Differential Revision: http://reviews.llvm.org/D4489 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213071 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-15 15:31:39 +00:00
Cameron McInally	160c4cb678	Add x86 patterns to match a specific add-with-carry. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213070 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-15 15:03:32 +00:00
NAKAMURA Takumi	3ee5fc8618	Prune Redundant libdeps in CMake's target_link_libraries and LLVMBuild.txt. I checked this with Release+Asserts on x86_64-mingw32. Please restore partially if this were overkill. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213064 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-15 11:37:03 +00:00
Andrea Di Biagio	a34d957b61	Silence a warning in conditional expression. Fixes a gcc warning caused by a typo. A redundant assignment operation was accidentally used as the third operand of a conditional expression. No functional change intended. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213061 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-15 10:53:44 +00:00
Tim Northover	fbb631183a	AArch64: fall back to generic code for out of range extract/insert. rdar://problem/17624784 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213059 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-15 10:00:26 +00:00
David Majnemer	218f127b63	Fix typo in comment No functionality changed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213052 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-15 07:11:32 +00:00
Juergen Ributzka	cce1bd7a0b	[FastISel][X86] Remove no longer needed functions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213051 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-15 06:35:53 +00:00
Juergen Ributzka	1b0266d7cb	[FastISel][X86] Implement the FastLowerIntrinsicCall hook. Rename X86VisitIntrinsicCall -> FastLowerIntrinsicCall, which effectively implements the target hook. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213050 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-15 06:35:50 +00:00
Juergen Ributzka	566afe2a15	[FastISel][X86] Implement the FastLowerCall hook. This implements the FastLowerCall hook, which is based on the DoSelectCall function. The implementation is very similar, but the target-independent call lowering part has been factored out. This should also enable patchpoint intrinsic lowering for FastISel on X86. Related to <rdar://problem/17427052>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213049 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-15 06:35:47 +00:00
Juergen Ributzka	a107ebd54d	Revert "[FastISel][X86] Remove no longer needed functions." Revert "[FastISel][X86] Implement the FastLowerIntrinsicCall hook." Revert "[FastISel][X86] Implement the FastLowerCall hook." This reverts commit r213035, r213036, and r213037 to make the buildbots happy again. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213048 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-15 05:23:40 +00:00
David Majnemer	cdc1044944	CodeGen: Handle ConstantVector and undef in WinCOFF constant pools The constant pool entry code for WinCOFF assumed that vector constants would be formed using ConstantDataVector, it did not expect to see a ConstantVector. Furthermore, it did not expect undef as one of the elements of the vector. ConstantVectors should be handled like ConstantDataVectors, treat Undef as zero. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213038 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-15 02:34:12 +00:00
Juergen Ributzka	3c0737454d	[FastISel][X86] Remove no longer needed functions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213037 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-15 02:22:56 +00:00
Juergen Ributzka	a7d1d3a513	[FastISel][X86] Implement the FastLowerIntrinsicCall hook. Rename X86VisitIntrinsicCall -> FastLowerIntrinsicCall, which effectively implements the target hook. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213036 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-15 02:22:53 +00:00
Juergen Ributzka	805486c3f9	[FastISel][X86] Implement the FastLowerCall hook. This implements the FastLowerCall hook, which is based on the DoSelectCall function. The implementation is very similar, but the target-independent call lowering part has been factored out. This should also enable patchpoint intrinsic lowering for FastISel on X86. Related to <rdar://problem/17427052>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213035 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-15 02:22:49 +00:00
Matt Arsenault	5fbf09a69f	R600: Add dag combine for copy of an illegal type. This helps avoid redundant instructions to unpack, and repack the vectors. Ideally we could recognize that pattern and eliminate it. Currently v4i8 and other small element type vectors are scalarized, so this has the added bonus of avoiding that. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213031 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-15 02:06:31 +00:00
Matt Arsenault	b7df516d1f	R600: Add denormal handling subtarget features. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213018 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-14 23:40:49 +00:00
Matt Arsenault	5b70c8ac7e	R600/SI: Default to no single precision denormals. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213017 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-14 23:40:43 +00:00
Adam Nemet	3afd71fc7d	[X86] Specify all TSFlags bit-offsets symbolically No functional change. The offsets for the other bitfields are specified symbolically. I need to increase the size for one of the earlier fields which is easier after this cleanup. Why these bits are relative to VEXShift is a bit strange but that is for another cleanup. I made sure that the values for the enums are unchanged after this change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213011 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-14 23:18:39 +00:00
David Majnemer	38d8be1ad8	CodeGen: Stick constant pool entries in COMDAT sections for WinCOFF COFF lacks a feature that other object file formats support: mergeable sections. To work around this, MSVC sticks constant pool entries in special COMDAT sections so that each constant is in it's own section. This permits unused constants to be dropped and it also allows duplicate constants in different translation units to get merged together. This fixes PR20262. Differential Revision: http://reviews.llvm.org/D4482 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213006 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-14 22:57:27 +00:00
Saleem Abdulrasool	5335b49f96	X86: correct 64-bit atomics on 32-bit We would emit a libcall for a 64-bit atomic on x86 after SVN r212119. This was due to the misuse of hasCmpxchg16 to indicate if cmpxchg8b was supported on a 32-bit target. They were added at different times and would result in the border condition being mishandled. This fixes the border case to emit the cmpxchg8b instruction for 64-bit atomic operations on x86 at the cost of restoring a long-standing bug in the codegen. We emit a cmpxchg8b on all x86 targets even where the CPU does not support this instruction (pre-Pentium CPUs). Although this bug should be fixed, this was present prior to SVN r212119 and this change, so this is not really introducing a regression. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212956 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-14 16:28:13 +00:00
Tim Northover	9e251d6d41	X86: remove temporary atomicrmw used during lowering. We construct a temporary "atomicrmw xchg" instruction when lowering atomic stores for widths that aren't supported natively. This isn't on the top-level worklist though, so it won't be removed automatically and we have to do it ourselves once that itself has been lowered. Thanks Saleem for pointing this out! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212948 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-14 15:31:13 +00:00
Daniel Sanders	52a51e197f	Re-commit: [mips] Correct section alignments and EntrySizes for .bss, .text, .data, .reginfo, .MIPS.options, and .MIPS.abiflags The lld tests will temporarily fail again but Simon Atanasyan will commit a fix for those shortly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212946 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-14 15:05:51 +00:00
Daniel Sanders	b70b4892a4	Revert: [mips] Correct section alignments and EntrySizes for .bss, .text, .data, .reginfo, .MIPS.options, and .MIPS.abiflags This commit causes multiple lld tests to fail. Reverting while I investigate the issue. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212945 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-14 14:43:45 +00:00
Daniel Sanders	05aee6b5f4	[mips] Correct section alignments and EntrySizes for .bss, .text, .data, .reginfo, .MIPS.options, and .MIPS.abiflags Summary: .bss, .text, and .data are at least 16-byte aligned. .reginfo is 4-byte aligned and has a 24-byte EntrySize. .MIPS.abiflags has an 24-byte EntrySize. .MIPS.options is 8-byte aligned and has 1-byte EntrySize. Using a 1-byte EntrySize for .MIPS.options seems strange because the records are neither 1-byte long nor fixed-length but this matches the value that GAS emits. Differential Revision: http://reviews.llvm.org/D4487 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212939 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-14 14:02:14 +00:00
Daniel Sanders	c4ce78e261	[mips] For the FP64A ABI, odd-numbered double-precision moves must not use mtc1/mfc1. Summary: This is because the FP64A the hardware will redirect 32-bit reads/writes from/to odd-numbered registers to the upper 32-bits of the corresponding even register. In effect, simulating FR=0 mode when FR=0 mode is not available. Unfortunately, we have to make the decision to avoid mfc1/mtc1 before register allocation so we currently do this for even registers too. FPXX has a similar requirement on 32-bit architectures that lack mfhc1/mthc1 so this patch also handles the affected moves from the FPU for FPXX too. Moves to the FPU were supported by an earlier commit. Differential Revision: http://reviews.llvm.org/D4484 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212938 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-14 13:08:14 +00:00
Daniel Sanders	543f70b040	[mips] Use MFHC1 when it is available (MIPS32r2 and later) for both FP32 and FP64 moves Summary: This is similar to r210771 which did the same thing for MTHC1. Also corrected MTHC1_D32 and MTHC1_D64 which used AFGR64 and FGR64 on the wrong definitions. Differential Revision: http://reviews.llvm.org/D4483 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212936 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-14 12:41:31 +00:00
Tim Northover	26012cec89	AArch64: remove unnecessary pseudo-instruction. Sufficiently twisted use of TableGen lets us write patterns directly for f16 (as an i16 promoted to i32) -> f32 conversion. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212933 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-14 11:16:02 +00:00
Daniel Sanders	e99ebb9b2f	[mips] Correct the AFL_FLAGS1_ODDSPREG flag in .MIPS.abiflags when no '.module oddspreg' is used Differential Revision: http://reviews.llvm.org/D4486 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212932 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-14 10:26:15 +00:00
Sasa Stankovic	fce699d88a	[mips] Expand BuildPairF64 to a spill and reload when the O32 FPXX ABI is enabled and mthc1 and dmtc1 are not available (e.g. on MIPS32r1) This prevents the upper 32-bits of a double precision value from being moved to the FPU with mtc1 to an odd-numbered FPU register. This is necessary to ensure that the code generated executes correctly regardless of the current FPU mode. MIPS32r2 and above continues to use mtc1/mthc1, while MIPS-IV and above continue to use dmtc1. Differential Revision: http://reviews.llvm.org/D4465 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212930 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-14 09:40:29 +00:00
NAKAMURA Takumi	95d3cfc51d	NVPTX/LLVMBuild.txt: Add "Scalar" to required_libraries. It is really referenced. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212918 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-14 02:52:19 +00:00
Saleem Abdulrasool	1f1e6e7884	MC: make DWARF and Windows unwinding handling more similar Rename member variables and functions for the MCStreamer for DWARF-like unwinding management. Rename the Windows ones as well and make the naming and handling similar across the two. No functional change intended. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212912 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-13 19:03:36 +00:00
Matt Arsenault	0ed20177c2	Remove unused include git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212898 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-13 03:08:59 +00:00
Matt Arsenault	74c9fe26e6	R600: Use range for and fix missing consts. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212897 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-13 03:06:43 +00:00
Matt Arsenault	bb098a4d87	R600: Make ShaderType private git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212896 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-13 03:06:39 +00:00
Matt Arsenault	8e53751320	R600: Add option to disable promote alloca This can make writing some tests harder, so add a flag to disable it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212893 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-13 02:08:26 +00:00
Saleem Abdulrasool	01c06d7954	AArch64: add support for llvm.aarch64.hint intrinsic This adds a llvm.aarch64.hint intrinsic to mirror the llvm.arm.hint in order to support the various hint intrinsic functions in the ACLE. Add an optional pattern field that permits the subclass to specify the pattern that matches the selection. The intrinsic pattern is set as mayLoad, mayStore, so overload the value for the definition of the hint instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212883 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-12 21:20:49 +00:00
Juergen Ributzka	b2e9db4109	Revert "[FastISel][X86] Implement the FastLowerIntrinsicCall hook." This reverts commit r212851, because it broke the memset lowering. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212855 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-11 23:10:08 +00:00
Juergen Ributzka	83772afe0e	[FastISel][X86] Implement the FastLowerIntrinsicCall hook. Rename X86VisitIntrinsicCall -> FastLowerIntrinsicCall, which effectively implements the target hook. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212851 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-11 22:37:43 +00:00
Ulrich Weigand	edb27188b4	[PowerPC] Fix invalid displacement created by LocalStackAlloc This commit fixes a bug in PPCRegisterInfo::isFrameOffsetLegal that could result in the LocalStackAlloc pass creating an MI instruction out-of-range displacement: %vreg17<def> = LD 33184, %vreg31; mem:LD8[%g](align=32) %G8RC:%vreg17 G8RC_and_G8RC_NOX0:%vreg31 (In final assembler output the top bits are stripped off, resulting in a negative offset loading from below the stack pointer.) Common code expects the isFrameOffsetLegal routine to verify whether adding a given offset to the offset already present in the instruction results in a valid displacement. However, on PowerPC the routine did not take the already present instruction offset into account. This commit fixes isFrameOffsetLegal to add the instruction offset, and updates a local caller (needsFrameBaseReg) to no longer add the instruction offset itself before calling isFrameOffsetLegal. Reviewed by Hal Finkel. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212832 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-11 17:19:31 +00:00
Marek Olsak	438e1f2ad8	R600/SI: Use i32 vectors for resources and samplers This affects new intrinsics only. What surprises me is that v32i8 still works. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212831 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-11 17:11:52 +00:00
Marek Olsak	5a35fdcafb	R600/SI: add sample and image intrinsics exposing all instruction fields We need the intrinsics with offsets, so why not just add them all. The R128 parameter will also be useful for reducing SGPR usage. GL_ARB_image_load_store also adds some image GLSL modifiers like "coherent", so Mesa will probably translate those to slc, glc, etc. When LLVM 3.5 is released, I'll switch Mesa to these new intrinsics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212830 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-11 17:11:46 +00:00
Marek Olsak	4071d6f58b	R600/SI: fix shadow mapping for 1D and 2D array textures It was conflicting with def TEX_SHADOW_ARRAY, which also handles them. Reviewed-by: Tom Stellard <thomas.stellard@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212829 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-11 17:11:39 +00:00
Oliver Stannard	cb047f2a74	ARM: Allow __fp16 as a function arg or return type for AArch64 ACLE 2.0 allows __fp16 to be used as a function argument or return type. This enables this for AArch64. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212812 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-11 13:33:46 +00:00
Quentin Colombet	52c50f5ec5	[X86] Fix the inversion of low and high bits for the lowering of MUL_LOHI. Also add a few comments. <rdar://problem/17581756> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212808 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-11 12:08:23 +00:00
Adam Nemet	11c9a50611	[X86] AVX512: Improve readability of isCDisp8 No functional change. As I was trying to understand this function, I found that variables were reused with confusing names and the broadcast case was a bit too implicit. Hopefully, this is an improvement. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212795 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-11 05:23:25 +00:00
Adam Nemet	fcc56eb31f	[X86] AVX512: Simplify logic in isCDisp8 It was computing the VL/n case as: MemObjSize = VectorByteSize / ElemByteSize / Divider * ElemByteSize ElemByteSize not only falls out but VectorByteSize/Divider now actually matches the definition of VL/n. Also some formatting fixes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212794 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-11 05:23:12 +00:00
Jan Vesely	865527a09b	R600: Implement float to long/ulong Use alg. from LegalizeDAG.cpp Move Expand setting to SIISellowering v2: Extend existing tests instead of creating new ones v3: use separate LowerFPTOSINT function v4: use TargetLowering::expandFP_TO_SINT add comment about using FP_TO_SINT for uints Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom@stellard.net> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212773 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-10 22:40:21 +00:00
Brad Smith	e39f7f95bc	Use the integrated assembler by default on OpenBSD. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212771 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-10 22:37:28 +00:00
Zoran Jovanovic	fc6a659676	[mips] Emit two CFI offset directives per double precision SDC1/LDC1 instead of just one for FR=1 registers Differential Revision: http://reviews.llvm.org/D4310 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212769 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-10 22:23:30 +00:00
Akira Hatanaka	60079c18bc	[X86] Mark pseudo instruction TEST8ri_NOEREX as hasSIdeEffects=0. Also, add a case clause in X86InstrInfo::shouldScheduleAdjacent to enable macro-fusion. <rdar://problem/15680770> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212747 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-10 18:00:53 +00:00
Eric Christopher	282eda9b74	Make it possible for the Subtarget to change between function passes in the mips back end. This, unfortunately, required a bit of churn in the various predicates to use a pointer rather than a reference. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212744 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-10 17:26:51 +00:00
David Majnemer	9478f0a6ff	Mips: Silence a -Wcovered-switch-default Remove a default label which covered no enumerators, replace it with a llvm_unreachable. No functionality changed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212729 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-10 16:04:04 +00:00
Zoran Jovanovic	0ef47114d3	[mips] Added FPXX modeless calling convention. Differential Revision: http://reviews.llvm.org/D4293 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212726 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-10 15:36:12 +00:00
Arnaud A. de Grandmaison	a9af0558b2	[AArch64] Add logical alias instructions to MC AsmParser This patch teaches the AsmParser to accept some logical+immediate instructions and convert them as shown: bic Rd, Rn, #imm -> and Rd, Rn, #~imm bics Rd, Rn, #imm -> ands Rd, Rn, #~imm orn Rd, Rn, #imm -> orr Rd, Rn, #~imm eon Rd, Rn, #imm -> eor Rd, Rn, #~imm Those instructions are an alternate syntax available to assembly coders, and are needed in order to support code already compiling with some other assemblers. For example, the bic construct is used by the linux kernel. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212722 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-10 15:12:26 +00:00
Tim Northover	6b0ac2aa02	AArch64: correctly fast-isel i8 & i16 multiplies We were asking for a register for type i8 or i16 which caused an assert. rdar://problem/17620015 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212718 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-10 14:18:46 +00:00
Daniel Sanders	24a071b8c5	[mips] Add support for -modd-spreg/-mno-odd-spreg Summary: When -mno-odd-spreg is in effect, 32-bit floating point values are not permitted in odd FPU registers. The option also prohibits 32-bit and 64-bit floating point comparison results from being written to odd registers. This option has three purposes: * It allows support for certain MIPS implementations such as loongson-3a that do not allow the use of odd registers for single precision arithmetic. * When using -mfpxx, -mno-odd-spreg is the default and this allows us to statically check that code is compliant with the O32 FPXX ABI since mtc1/mfc1 instructions to/from odd registers are guaranteed not to appear for any reason. Once this has been established, the user can then re-enable -modd-spreg to regain the use of all 32 single-precision registers. * When using -mfp64 and -mno-odd-spreg together, an O32 extension named O32 FP64A is used as the ABI. This is intended to provide almost all functionality of an FR=1 processor but can also be executed on a FR=0 core with the assistance of a hardware compatibility mode which emulates FR=0 behaviour on an FR=1 processor. * Added '.module oddspreg' and '.module nooddspreg' each of which update the .MIPS.abiflags section appropriately * Moved setFpABI() call inside emitDirectiveModuleFP() so that the caller doesn't have to remember to do it. * MipsABIFlags now calculates the flags1 and flags2 member on demand rather than trying to maintain them in the same format they will be emitted in. There is one portion of the -mfp64 and -mno-odd-spreg combination that is not implemented yet. Moves to/from odd-numbered double-precision registers must not use mtc1. I will fix this in a follow-up. Differential Revision: http://reviews.llvm.org/D4383 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212717 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-10 13:38:23 +00:00
Zinovy Nis	a2bc403710	[x32] Add AsmBackend for X32 which uses ELF32 with x86_64 (the author is Pavel Chupin). This is minimal change for backend required to have "hello world" compiled and working on x32 target (x86_64-linux-gnux32). More patches for x32 will follow. Differential Revision: http://reviews.llvm.org/D4181 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212716 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-10 13:03:26 +00:00
Richard Sandiford	35dda8a53c	[SystemZ] Use SystemZCallingConv.td to define callee-saved registers Just a clean-up. No behavioral change intended. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212711 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-10 11:44:37 +00:00
Richard Sandiford	446067b143	[SystemZ] Tweak instruction format classifications There's no real need to have Shift as a separate format type from Binary. The comments for other format types were too specific and in some cases no longer accurate. Just a clean-up, no behavioral change intended. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212707 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-10 11:29:23 +00:00
Chandler Carruth	a4e7f05a28	[x86] Add another combine that is particularly useful for the new vector shuffle lowering: match shuffle patterns equivalent to an unpcklwd or unpckhwd instruction. This allows us to use generic lowering code for v8i16 shuffles and match the unpack pattern late. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212705 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-10 11:09:29 +00:00
Richard Sandiford	9aa8beb9d6	[SystemZ] Add MC support for LEDBRA, LEXBRA and LDXBRA These instructions aren't used for codegen since the original L*DB instructions are suitable for fround. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212703 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-10 11:00:55 +00:00
Richard Sandiford	beefa3ada0	[SystemZ] Avoid using i8 constants for immediate fields Immediate fields that have no natural MVT type tended to use i8 if the field was small enough. This was a bit confusing since i8 isn't a legal type for the target. Fields for short immediates in a 32-bit or 64-bit operation use i32 or i64 instead, so it would be better to do the same for all fields. No behavioral change intended. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212702 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-10 10:52:51 +00:00
Richard Sandiford	b350ec7ec6	[SystemZ] Fix FPR dwarf numbering The dwarf FPR numbers are supposed to have the order F0, F2, F4, F6, F1, F3, F5, F7, F8, etc., which matches the pairing of registers for long doubles. E.g. a long double stored in F0 is paired with F2. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212701 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-10 10:45:11 +00:00
Daniel Sanders	b0b3161567	Make it possible for ints/floats to return different values from getBooleanContents() Summary: On MIPS32r6/MIPS64r6, floating point comparisons return 0 or -1 but integer comparisons return 0 or 1. Updated the various uses of getBooleanContents. Two simplifications had to be disabled when float and int boolean contents differ: - ScalarizeVecRes_VSELECT except when the kind of boolean contents is trivially discoverable (i.e. when the condition of the VSELECT is a SETCC node). - visitVSELECT (select C, 0, 1) -> (xor C, 1). Come to think of it, this one could test for the common case of 'C' being a SETCC too. Preserved existing behaviour for all other targets and updated the affected MIPS32r6/MIPS64r6 tests. This also fixes the pi benchmark where the 'low' variable was counting in the wrong direction because it thought it could simply add the result of the comparison. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: hfinkel, jholewinski, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D4389 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212697 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-10 10:18:12 +00:00
Chandler Carruth	977aab501d	[x86] Expand the target DAG combining for PSHUFD nodes to be able to combine into half-shuffles through unpack instructions that expand the half to a whole vector without messing with the dword lanes. This fixes some redundant instructions in splat-like lowerings for v16i8, which are now getting to be really nice. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212695 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-10 09:57:36 +00:00
Chandler Carruth	dc90a3ab8f	[x86] Tweak the v16i8 single input special case lowering for shuffles that splat i8s into i16s. Previously, we would try much too hard to arrange a sequence of i8s in one half of the input such that we could unpack them into i16s and shuffle those into place. This isn't always going to be a cheaper i8 shuffle than our other strategies. The case where it is always going to be cheaper is when we can arrange all the necessary inputs into one half using just i16 shuffles. It happens that viewing the problem this way also makes it much easier to produce an efficient set of shuffles to move the inputs into one half and then unpack them. With this, our splat code gets one step closer to being not terrible with the new experimental lowering strategy. It also exposes two combines missing which I will add next. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212692 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-10 09:16:40 +00:00
Chandler Carruth	95b14b00da	[x86] Initial improvements to the new shuffle lowering for v16i8 shuffles specifically for cases where a small subset of the elements in the input vector are actually used. This is specifically targetted at improving the shuffles generated for trunc operations, but also helps out splat-like operations. There is still some really low-hanging fruit here that I want to address but this is a huge step in the right direction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212680 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-10 04:34:06 +00:00
Matt Arsenault	425ef825a6	R600/SI: Add support for llvm.convert.{to\|from}.fp16 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212676 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-10 03:22:20 +00:00
Chandler Carruth	c4788790f6	[x86] Refactor some of the new code for lowering v16i8 shuffles to remove duplication and make it easier to select different strategies. No functionality changed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212674 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-10 02:24:26 +00:00
Chandler Carruth	515f228ec3	[SDAG] Make the new zext-vector-inreg node default to expand so targets don't need to set it manually. This is based on feedback from Tom who pointed out that if every target needs to handle this we need to reach out to those maintainers. In fact, it doesn't make sense to duplicate everything when anything other than expand seems unlikely at this stage. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212661 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-09 22:53:04 +00:00
Jim Grosbach	a3edd6a038	AArch64: Better codegen for storing to __fp16. Storing will generally be immediately preceded by rounding from an f32 or f64, so make sure to match those patterns directly to convert into the FPR16 register class directly rather than going through the integer GPRs. This also eliminates an extra step in the convert-from-f64 path which was first converting to f32 and then to f16 from there. rdar://17594379 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212638 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-09 18:55:52 +00:00
Benjamin Kramer	08b75dd061	TargetRegisterInfo: Remove function that fell out of use years ago. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212636 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-09 18:53:57 +00:00
Adam Nemet	074b752cc9	[X86] AVX512: Enable it in the Loop Vectorizer This lets us experiment with 512-bit vectorization without passing force-vector-width manually. The code generated for a simple integer memset loop is properly vectorized. Disassembly is still broken for it though :(. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212634 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-09 18:22:33 +00:00
Louis Gerbarg	479af7ce0c	Make AArch64FastISel::EmitIntExt explicitly check its source and destination types This is a follow up to r212492. There should be no functional difference, but this patch makes it clear that SrcVT must be an i1/i8/16/i32 and DestVT must be an i8/i16/i32/i64. rdar://17516686 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212633 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-09 17:54:32 +00:00
Benjamin Kramer	4afbd3e941	X86: When lowering v8i32 himuls use the correct shuffle masks for AVX2. Turns out my trick of using the same masks for SSE4.1 and AVX2 didn't work out as we have to blend two vectors. While there remove unecessary cross-lane moves from the shuffles so the backend can lower it to palignr instead of vperm. Fixes PR20118, a miscompilation of vector sdiv by constant on AVX2. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212611 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-09 11:12:39 +00:00
Chandler Carruth	ce184e95f9	[x86] Add a ZERO_EXTEND_VECTOR_INREG DAG node and use it when widening vector types to be legal and a ZERO_EXTEND node is encountered. When we use widening to legalize vector types, extend nodes are a real challenge. Either the input or output is likely to be legal, but in many cases not both. As a consequence, we don't really have any way to represent this situation and the prior code in the widening legalization framework would just scalarize the extend operation completely. This patch introduces a new DAG node to represent doing a zero extend of a vector "in register". The core of the idea is to allow legal but different vector types in the input and output. The output vector must have fewer lanes but wider elements. The operation is defined to zero extend the low elements of the input to the size of the output elements, and drop all of the high elements which don't have a corresponding lane in the output vector. It also includes generic expansion of this node in terms of blending a zero vector into the high elements of the vector and bitcasting across. This in turn yields extremely nice code for x86 SSE2 when we use the new widening legalization logic in conjunction with the new shuffle lowering logic. There is still more to do here. We need to support sign extension, any extension, and potentially int-to-float conversions. My current plan is to continue using similar synthetic nodes to model each of these transitions with generic lowering code for each one. However, with this patch LLVM already reaches performance parity with GCC for the core C loops of the x264 code (assuming you disable the hand-written assembly versions) when compiling for SSE2 and SSE3 architectures and enabling the new widening and lowering logic for vectors. Differential Revision: http://reviews.llvm.org/D4405 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212610 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-09 10:58:18 +00:00
Daniel Sanders	1285b3130b	[mips][mips64r6] Correct select patterns that have the condition or true/false values backwards Summary: This bug caused SingleSource/Regression/C/uint64_to_float and SingleSource/UnitTests/2002-05-02-CastTest3 to fail (among others). Differential Revision: http://reviews.llvm.org/D4388 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212608 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-09 10:47:26 +00:00
Daniel Sanders	388704618e	[mips][mips64r6] Correct cond names in the cmp.cond.[ds] instructions Summary: It seems we accidentally read the wrong column of the table MIPS64r6 spec and used the names for c.cond.fmt instead of cmp.cond.fmt. Differential Revision: http://reviews.llvm.org/D4387 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212607 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-09 10:40:20 +00:00
Chandler Carruth	a2a8a72f6f	[x86] Initialize a pointer to null to fix a bug in r212602. This should restore GCC hosts (which happen to put the bad stuff into the pointer) and MSan, etc. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212606 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-09 10:36:42 +00:00
Daniel Sanders	f08bcb9b97	[mips][mips64r6] Use JALR for indirect branches instead of JR (which is not available on MIPS32r6/MIPS64r6) Summary: This completes the change to use JALR instead of JR on MIPS32r6/MIPS64r6. Reviewers: jkolek, vmedic, zoran.jovanovic, dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4269 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212605 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-09 10:21:59 +00:00
Daniel Sanders	7c2ef822f7	[mips][mips64r6] Use JALR for returns instead of JR (which is not available on MIPS32r6/MIPS64r6) Summary: RET, and RET_MM have been replaced by a pseudo named PseudoReturn. In addition a version with a 64-bit GPR named PseudoReturn64 has been added. Instruction selection for a return matches RetRA, which is expanded post register allocation to PseudoReturn/PseudoReturn64. During MipsAsmPrinter, this PseudoReturn/PseudoReturn64 are emitted as: - (JALR64 $zero, $rs) on MIPS64r6 - (JALR $zero, $rs) on MIPS32r6 - (JR_MM $rs) on microMIPS - (JR $rs) otherwise On MIPS32r6/MIPS64r6, 'jr $rs' is an alias for 'jalr $zero, $rs'. To aid development and review (specifically, to ensure all cases of jr are updated), these aliases are temporarily named 'r6.jr' instead of 'jr'. A follow up patch will change them back to the correct mnemonic. Added (JALR $zero, $rs) to MipsNaClELFStreamer's definition of an indirect jump, and removed it from its definition of a call. Note: I haven't accounted for MIPS64 in MipsNaClELFStreamer since it's doesn't appear to account for any MIPS64-specifics. The return instruction created as part of eh_return expansion is now expanded using expandRetRA() so we use the right return instruction on MIPS32r6/MIPS64r6 ('jalr $zero, $rs'). Also, fixed a misuse of isABI_N64() to detect 64-bit wide registers in expandEhReturn(). Reviewers: jkolek, vmedic, mseaborn, zoran.jovanovic, dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D4268 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212604 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-09 10:16:07 +00:00
Chandler Carruth	98eac0a244	[x86] Re-apply a variant of the x86 side of r212324 now that the rest has settled without incident, removing the x86-specific and overly strict 'isVectorSplat' routine in favor of generic and more powerful splat detection. The primary motivation and result of this is that the x86 backend can now see through splats which contain undef elements. This is essential if we are using a widening form of legalization and I've updated a test case to also run in that mode as before this change the generated code for the test case was completely scalarized. This version of the patch much more carefully handles the undef lanes. - We aren't overly conservative about them in the shift lowering (where we will never use the splat itself). - One place where the splat would have been re-used by the existing code now explicitly constructs a new constant splat that will be safe. - The broadcast lowering is much more reasonable with undefs by doing a correct check of whether the splat is the only user of a loaded value, checking that the splat actually crosses multiple lanes before using a broadcast, and handling broadcasts of non-constant splats. As a consequence of the last bullet, the weird usage of vpshufd instead of vbroadcast is gone, and we actually can lower an AVX splat with vbroadcastss where before we emitted a really strange pattern of a vector load and a manual splat across the vector. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212602 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-09 10:06:58 +00:00
NAKAMURA Takumi	717c9da9d2	MipsTargetStreamer.h: Avoid "using" to appease msc17. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212577 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-08 23:48:22 +00:00
Jim Grosbach	05bb7c5045	AArch64: Better codegen for loading from __fp16. Loading will generally extend to an f32 or an 64, so make sure to match those patterns directly to load into the FPR16 register class directly rather than going through the integer GPRs. This also eliminates an extra step in the convert-to-f64 path which was first converting to f32 and then to f64 from there. rdar://17594379 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212573 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-08 23:28:48 +00:00
Ulrich Weigand	b7fdc7ff16	[PowerPC] Implement atomic NAND operations as actual NAND This changes the implementation of atomic NAND operations from "a & ~b" (compatible with GCC < 4.4) to actual "~(a & b)" (compatible with GCC >= 4.4). This is in line with the common-code and ARM back-end change implemented in r212433. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212547 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-08 16:16:02 +00:00
Daniel Sanders	7a16d24f8d	[mips] Fixed struct/class mismatch introduced in r212522. Clang emits a warning about this. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212528 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-08 13:13:42 +00:00
Daniel Sanders	3c6b29cbde	Fix r212522 - [mips] Improve encapsulation of the .MIPS.abiflags implementation and limit scope of related enums Added two lines that should have been in r212522. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212523 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-08 10:35:52 +00:00
Daniel Sanders	fbdb8e1eac	[mips] Improve encapsulation of the .MIPS.abiflags implementation and limit scope of related enums Summary: Follow on to r212519 to improve the encapsulation and limit the scope of the enums. Also merged two very similar parser functions, fixed a bug where ASE's were not being reported, and marked CPR1's as being 128-bit when MSA is enabled. Differential Revision: http://reviews.llvm.org/D4384 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212522 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-08 10:11:38 +00:00
Renato Golin	aebcee661e	Revert "Refactor ARM subarchitecture parsing" This reverts commit `7b4a688246`. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212521 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-08 10:06:16 +00:00
Arnaud A. de Grandmaison	60d8767211	Truncate the immediate in logical operation to the register width And continue to produce an error if the 32 most significant bits are not all ones or zeros. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212520 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-08 09:53:04 +00:00
Vladimir Medic	ffbc2a1325	Mips.abiflags is a new implicitly generated section that will be present on all new modules. The section contains a versioned data structure which represents essentially information to allow a program loader to determine the requirements of the application. This patch implements mips.abiflags section and provides test cases for it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212519 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-08 08:59:22 +00:00
Chandler Carruth	25b7d54e7f	[x86,SDAG] Sink the logic for folding shuffles of splats more aggressively from the x86 shuffle lowering to the generic SDAG vector shuffle formation code. This code already tried to fold away shuffles of splats! It just had lots of bugs and couldn't handle the case my new x86 shuffle lowering needed. First, it failed to correctly compute whether N2 was undef because it pre-computed this, then did transformations which could make N2 undef, then failed to ever re-consider the precomputed state. Second, it didn't look through bitcasts at all, even in the safe cases where they are just element-type bitcasts with no change to the number of elements. Third, it didn't handle all-zero bit casts nicely the way my code in the x86 side of things did, which is essential to getting good zext-shuffle lowerings. But all of these are generic. I just ported the code down to this layer and fixed the surrounding bugs. Tests exercising this in the x86 backend still pass and some silly code in widen_cast-6.ll gets better. I updated that test to be a bit more precise but it's still pretty unclear what the value of the test is in this day and age. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212517 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-08 08:45:38 +00:00
Adam Nemet	f189d9cdf7	[X86] AVX512: Only allow k1-k7 as predicates to vpcmp* As destination k0 is allowed but not as predicate/writemask. I also modified the test to allow checking of error messages by the assembler. I applied a similar approach to the test ret.s in the same directory. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212504 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-08 00:22:32 +00:00
Andrea Di Biagio	cfb83b7bac	[x86] Fix assertion failure caused by a wrong combine of PSHUFD nodes with different types. When combining a sequence of two PSHUFD dag nodes into a single PSHUFD, make sure that we assign the correct type to the resulting PSHUFD. X86ISD::PSHUFD dag nodes can be either MVT::v4i32 or MVT::v4f32. Before this change, an assertion failure was triggered in method 'DAGCombinerInfo::CombineTo' when trying to combine the shuffles from the test below into a single PSHUFD. define <4 x float> @test1(<4 x float> %V) { %1 = shufflevector <4 x float> %V, <4 x float> undef, <4 x i32> <i32 3, i32 0, i32 2, i32 1> %2 = shufflevector <4 x float> %1, <4 x float> undef, <4 x i32> <i32 3, i32 0, i32 2, i32 1> ret <4 x float> %2 } git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212498 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-07 23:25:23 +00:00
Juergen Ributzka	1154be8198	[FastISel][X86] Fix smul.with.overflow.i8 lowering. Add custom lowering code for signed multiply instruction selection, because the default FastISel instruction selection for ISD::MUL will use unsigned multiply for the i8 type and signed multiply for all other types. This would set the incorrect flags for the overflow check. This fixes <rdar://problem/17549300> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212493 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-07 21:52:21 +00:00
Louis Gerbarg	e7f8191b18	Allow AArch64FastISel to degrade graceully in the presence of an MVT::i128 Currently AArch64FastISel crashes if it tries to extend an integer into an MVT::i128. This can happen by creating 128 bit integers like so: typedef unsigned int uint128_t __attribute__((mode(TI))); typedef int sint128_t __attribute__((mode(TI))); This patch makes EmitIntExt check for their presence and then falls back to SelectionDAG. Tests included. rdar://17516686 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212492 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-07 21:37:51 +00:00
Renato Golin	7b4a688246	Refactor ARM subarchitecture parsing According to a FIXME in ARMMCTargetDesc.cpp the ARM version parsing should be in the Triple helper class. Patch by: Gabor Ballabas git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212479 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-07 20:01:11 +00:00
Ulrich Weigand	b053ddc909	[PowerPC] Fix no-assert build r212476 caused a compile failure (unused variable) in a non-assertion build ... git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212477 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-07 19:39:44 +00:00
Ulrich Weigand	bf7bfe3549	[PowerPC] Fix "byval align" arguments Arguments passed as "byval align" should get the specified alignment in the parameter save area. There was some code in PPCISelLowering.cpp that attempted to implement this, but this didn't work correctly: while code did update the ArgOffset value, it neglected to update the PtrOff value (which was already computed from the old ArgOffset), and it also neglected to update GPR_idx -- fields skipped due to alignment in the save area must likewise be skipped in GPRs. This patch fixes and simplifies this logic by: - handling argument offset alignment right at the beginning of argument processing, using a new helper routine CalculateStackSlotAlignment (this avoids having to update PtrOff and other derived values later on) - not tracking GPR_idx separately, but always computing the correct GPR_idx for each argument from its ArgOffset - removing some redundant computation in LowerFormalArguments: MinReservedArea must equal ArgOffset after argument processing, so there's no use in computing it twice. [This doesn't change the behavior of the current clang front-end, since that never creates "byval align" arguments at the moment. This will change with a follow-on patch, however.] git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212476 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-07 19:26:41 +00:00
Chandler Carruth	7fcb422bb2	[x86] Revert r212324 which was too aggressive w.r.t. allowing undef lanes in vector splats. The core problem here is that undef lanes can't unilaterally be considered to contribute to splats. Their handling needs to be more cautious. There is also a reported failure of the nightly testers (thanks Tobias!) that may well stem from the same core issue. I'm going to fix this theoretical issue, factor the APIs a bit better, and then verify that I don't see anything bad with Tobias's reduction from the test suite before recommitting. Original commit message for r212324: [x86] Generalize BuildVectorSDNode::getConstantSplatValue to work for any constant, constant FP, or undef splat and to tolerate any undef lanes in a splat, then replace all uses of isSplatVector in X86's lowering with it. This fixes issues where undef lanes in an otherwise splat vector would prevent the splat logic from firing. It is a touch more awkward to use this interface, but it is much more accurate. Suggestions for better interface structuring welcome. With this fix, the code generated with the widening legalization strategy for widen_cast-4.ll is dramatically improved as the special lowering strategies for a v16i8 SRA kick in even though the high lanes are undef. We also get a slightly different choice for broadcasting an aligned memory location, and use vpshufd instead of vbroadcastss. This looks like a minor win for pipelining and domain crossing, but a minor loss for the number of micro-ops. I suspect its a wash, but folks can easily tweak the lowering if they want. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212475 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-07 19:03:32 +00:00
Matt Arsenault	0e1619e77c	R600: Fix mishandling of load / store chains. Fixes various bugs with reordering loads and stores. Scalarized vector loads weren't collecting the chains at all. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212473 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-07 18:34:45 +00:00
Matt Arsenault	7b1c5f52b0	Fix typo, weird indentation git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212472 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-07 18:34:42 +00:00
Tim Northover	a15d70370e	X86: revert unintentional change to X86FastISel. This crept in with r212443. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212459 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-07 14:06:42 +00:00
Evgeniy Stepanov	5ba71b01d8	[asan] Generate asm instrumentation in MC. Generate entire ASan asm instrumentation in MC without relying on runtime helper functions. Patch by Yuri Gorshenin. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212455 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-07 13:57:37 +00:00
Chandler Carruth	36ad61f4ea	[x86] Teach the new vector shuffle lowering code to handle what is essentially a DAG combine that never gets a chance to run. We might typically expect DAG combining to remove shuffles-of-splats and other similar patterns, but we don't get a chance to run the DAG combiner when we recursively form sub-shuffles during the lowering of a shuffle. So instead hand-roll a really important combine directly into the lowering code to detect shuffles-of-splats, especially shuffles of an all-zero splat which needn't even have the same element width, etc. This lets the new vector shuffle lowering handle shuffles which implement things like zero-extension really nicely. This will become even more important when I wire the legalization of zero-extension to vector shuffles with the new widening legalization strategy. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212444 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-07 09:06:58 +00:00
Tim Northover	3e16b022be	CodeGen: it turns out that NAND is not the same thing as BIC. At all. We've been performing the wrong operation on ARM for "atomicrmw nand" for years, since "a NAND b" is "~(a & b)" rather than ARM's very tempting "a & ~b". This bled over into the generic expansion pass. So I assume no-one has ever actually tried to do an atomic nand in the real world. Oh well. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212443 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-07 09:06:35 +00:00

1 2 3 4 5 ...

29897 Commits