llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2025-10-25 10:27:04 +00:00

Author	SHA1	Message	Date
Hans Wennborg	ce540c1084	Merging r243057: ------------------------------------------------------------------------ r243057 \| spatel \| 2015-07-23 15:56:53 -0700 (Thu, 23 Jul 2015) \| 16 lines fix crash in machine trace metrics due to processing dbg_value instructions (PR24199) The test in PR24199 ( https://llvm.org/bugs/show_bug.cgi?id=24199 ) crashes because machine trace metrics was not ignoring dbg_value instructions when calculating data dependencies. The machine-combiner pass asks machine trace metrics to calculate an instruction trace, does some reassociations, and calls MachineInstr::eraseFromParentAndMarkDBGValuesForRemoval() along with MachineTraceMetrics::invalidate(). The dbg_value instructions have their operands invalidated, but the instructions are not expected to be deleted. On a subsequent loop iteration of the machine-combiner pass, machine trace metrics would be called again and die while accessing the invalid debug instructions. Differential Revision: http://reviews.llvm.org/D11423 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@243662 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-30 17:17:47 +00:00
Hans Wennborg	d0702afadf	Merging r243638 and r243640: ------------------------------------------------------------------------ r243638 \| vkalintiris \| 2015-07-30 05:39:33 -0700 (Thu, 30 Jul 2015) \| 12 lines [mips][FastISel] Remove hidden mips-fast-isel option. Summary: This hidden option would disable code generation through FastISel by default. It was removed from the available options and from the Fast-ISel tests that required it in order to run the tests. Reviewers: dsanders Subscribers: qcolombet, llvm-commits Differential Revision: http://reviews.llvm.org/D11610 ------------------------------------------------------------------------ ------------------------------------------------------------------------ r243640 \| vkalintiris \| 2015-07-30 06:13:09 -0700 (Thu, 30 Jul 2015) \| 5 lines [mips] Fix out-of-date debug information in test file. Update the debug info in the check-lines because the change in r243638 introduced a constant initialization before the prologue's end as part of a register spill. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@243650 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-30 16:18:53 +00:00
Hans Wennborg	6055b680c7	Merging r243636: ------------------------------------------------------------------------ r243636 \| vkalintiris \| 2015-07-30 04:51:44 -0700 (Thu, 30 Jul 2015) \| 34 lines [mips][FastISel] Apply only zero-extension to constants prior to their materialization. Summary: Previously, we would sign-extend non-boolean negative constants and zero-extend otherwise. This was problematic for PHI instructions with negative values that had a type with bitwidth less than that of the register used for materialization. More specifically, ComputePHILiveOutRegInfo() assumes the constants present in a PHI node are zero extended in their container and afterwards deduces the known bits. For example, previously we would materialize an i16 -4 with the following instruction: addiu $r, $zero, -4 The register would end-up with the 32-bit 2's complement representation of -4. However, ComputePHILiveOutRegInfo() would generate a constant with the upper 16-bits set to zero. The SelectionDAG builder would use that information to generate an AssertZero node that would remove any subsequent trunc & zero_extend nodes. In theory, we should modify ComputePHILiveOutRegInfo() to consult target-specific hooks about the way they prefer to materialize the given constants. However, git-blame reports that this specific code has not been touched since 2011 and it seems to be working well for every target so far. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11592 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@243648 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-30 16:16:42 +00:00
Hans Wennborg	dc99a4f513	Merging r243485: ------------------------------------------------------------------------ r243485 \| vkalintiris \| 2015-07-28 14:43:31 -0700 (Tue, 28 Jul 2015) \| 12 lines [mips][FastISel] Fix call lowering by bailing out on "fastcc" calls. Summary: Currently, we support only the MIPS O32 ABI calling convention for call lowering. With this change we avoid using the O32 calling convention for lowering calls marked as using the fast calling convention. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11515 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@243647 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-30 16:15:22 +00:00
Hans Wennborg	53a63a9874	Merging r243469: ------------------------------------------------------------------------ r243469 \| vkalintiris \| 2015-07-28 12:57:25 -0700 (Tue, 28 Jul 2015) \| 12 lines [mips][FastISel] Fix generated code for IR's select instruction. Summary: Generate correct code for the select instruction by zero-extending its boolean/condition operand to GPR-width. This is necessary because the conditional-move instructions operate on the whole register. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11506 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@243646 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-30 16:14:05 +00:00
Hans Wennborg	61c6f9e581	Merging r243519: ------------------------------------------------------------------------ r243519 \| wschmidt \| 2015-07-29 07:31:57 -0700 (Wed, 29 Jul 2015) \| 14 lines [PPC] Fix PR24216: Don't generate splat for misaligned shuffle mask Given certain shuffle-vector masks, LLVM emits splat instructions which splat the wrong bytes from the source register. The issue is that the function PPC::isSplatShuffleMask() in PPCISelLowering.cpp does not ensure that the splat pattern found is requesting bytes that are aligned on an EltSize boundary. This patch detects this situation as not a valid splat mask, resulting in a permute being generated instead of a splat. Patch and test case by Tyler Kenney, cleaned up a bit by me. This is a simple bug fix that would be good to incorporate into 3.7. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@243528 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-29 15:58:34 +00:00
Hans Wennborg	1008467523	Merging r243500: (conflicts resolved manually since the branch doesn't have r243293) ------------------------------------------------------------------------ r243500 \| spatel \| 2015-07-28 16:28:22 -0700 (Tue, 28 Jul 2015) \| 16 lines ignore duplicate divisor uses when transforming into reciprocal multiplies (PR24141) PR24141: https://llvm.org/bugs/show_bug.cgi?id=24141 contains a test case where we have duplicate entries in a node's uses() list. After r241826, we use CombineTo() to delete dead nodes when combining the uses into reciprocal multiplies, but this fails if we encounter the just-deleted node again in the list. The solution in this patch is to not add duplicate entries to the list of users that we will subsequently iterate over. For the test case, this avoids triggering the combine divisors logic entirely because there really is only one user of the divisor. Differential Revision: http://reviews.llvm.org/D11345 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@243524 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-29 15:38:37 +00:00
Hans Wennborg	d6482ace16	Merging r243361: ------------------------------------------------------------------------ r243361 \| spatel \| 2015-07-27 17:48:32 -0700 (Mon, 27 Jul 2015) \| 17 lines fix invalid load folding with SSE/AVX FP logical instructions (PR22371) This is a follow-up to the FIXME that was added with D7474 ( http://reviews.llvm.org/rL229531 ). I thought this load folding bug had been made hard-to-hit, but it turns out to be very easy when targeting 32-bit x86 and causes a miscompile/crash in Wine: https://bugs.winehq.org/show_bug.cgi?id=38826 https://llvm.org/bugs/show_bug.cgi?id=22371#c25 The quick fix is to simply remove the scalar FP logical instructions from the load folding table in X86InstrInfo, but that causes us to miss load folds that should be possible when lowering fabs, fneg, fcopysign. So the majority of this patch is altering those lowerings to use vector FP logical instructions (because that's all x86 gives us anyway). That lets us do the load folding legally. Differential Revision: http://reviews.llvm.org/D11477 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@243435 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-28 16:20:00 +00:00
Hans Wennborg	f09b9c933a	Merging r243294: ------------------------------------------------------------------------ r243294 \| mareko \| 2015-07-27 11:16:08 -0700 (Mon, 27 Jul 2015) \| 9 lines AMDGPU: don't match vgpr loads for constant loads Author: Dave Airlie <airlied@redhat.com> In order to implement indirect sampler loads, we don't want to match on a VGPR load but an SGPR one for constants, as we cannot feed VGPRs to the sampler only SGPRs. this should be applicable for llvm 3.7 as well. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@243317 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-27 20:19:04 +00:00
Hans Wennborg	f4c16a9237	Merging r243263: ------------------------------------------------------------------------ r243263 \| mareko \| 2015-07-27 04:37:42 -0700 (Mon, 27 Jul 2015) \| 3 lines AMDGPU/SI: Fix the V_FRACT_F64 SI bug workaround This is a candidate for 3.7. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@243316 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-27 20:17:19 +00:00
Hans Wennborg	ab8b2f1f64	Merge r242733, r242734, r242735 and r242742 (r242742 is the interesting patch here, but I picked the others too to get a clean merge since there's been some back-and-forth on this file.) ------------------------------------------------------------------------ r242733 \| matze \| 2015-07-20 16:17:14 -0700 (Mon, 20 Jul 2015) \| 3 lines Revert "ARM: Use SpecificBumpPtrAllocator to fix leak introduced in r241920" This reverts commit r241951. It caused http://llvm.org/PR24190 ------------------------------------------------------------------------ ------------------------------------------------------------------------ r242734 \| matze \| 2015-07-20 16:17:16 -0700 (Mon, 20 Jul 2015) \| 3 lines Revert "ARMLoadStoreOpt: Merge subs/adds into LDRD/STRD; Factor out common code" This reverts commit r241928. This caused http://llvm.org/PR24190 ------------------------------------------------------------------------ ------------------------------------------------------------------------ r242735 \| matze \| 2015-07-20 16:17:20 -0700 (Mon, 20 Jul 2015) \| 3 lines Revert "ARMLoadStoreOptimizer: Create LDRD/STRD on thumb2" This reverts commit r241926. This caused http://llvm.org/PR24190 ------------------------------------------------------------------------ ------------------------------------------------------------------------ r242742 \| matze \| 2015-07-20 17:18:59 -0700 (Mon, 20 Jul 2015) \| 7 lines ARMLoadStoreOptimizer: Create LDRD/STRD on thumb2 Re-apply r241926 with an additional check that r13 and r15 are not used for LDRD/STRD. See http://llvm.org/PR24190. This also already includes the fix from r241951. Differential Revision: http://reviews.llvm.org/D10623 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242907 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-22 16:13:29 +00:00
Hans Wennborg	db5f467332	Merging r242680: ------------------------------------------------------------------------ r242680 \| wschmidt \| 2015-07-20 08:43:21 -0700 (Mon, 20 Jul 2015) \| 1 line Add missing test for r242296 (vec_sld) ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242686 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-20 16:47:36 +00:00
Hans Wennborg	b26a5f5b1e	Merging r242673: ------------------------------------------------------------------------ r242673 \| tstellar \| 2015-07-20 07:28:41 -0700 (Mon, 20 Jul 2015) \| 11 lines AMDGPU/SI: Add VI patterns to select FLAT instructions for global memory ops Summary: The MUBUF addr64 bit has been removed on VI, so we must use FLAT instructions when the pointer is stored in VGPRs. Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11067 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242685 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-20 16:46:01 +00:00
Hans Wennborg	dffc572cbe	Merging r242434: ------------------------------------------------------------------------ r242434 \| tstellar \| 2015-07-16 12:40:09 -0700 (Thu, 16 Jul 2015) \| 7 lines AMDPGU/SI: Negative offsets aren't allowed in MUBUF's vaddr operand Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11226 ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242684 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-20 16:43:33 +00:00
Hans Wennborg	bd53caa737	Merging r242442: ------------------------------------------------------------------------ r242442 \| wschmidt \| 2015-07-16 14:14:07 -0700 (Thu, 16 Jul 2015) \| 14 lines [PowerPC] v4i32 is a VSRCRegClass I was looking at some vector code generation and kept seeing unnecessary vector copies into the Altivec half of the VSX registers. I discovered that we overlooked v4i32 when adding the register classes for VSX; we only added v4f32 and v2f64. This means that anything that canonicalizes into v4i32 (which is a LOT of stuff) ends up being forced into VRRC on its way to VSRC. The fix is one line. The rest of the patch is fixing up some test cases whose code generation has changed as a result. This seems like it would be a good candidate for backport to 3.7. ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242447 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-16 21:41:21 +00:00
Hans Wennborg	445ea38ee9	Merging r242239: ------------------------------------------------------------------------ r242239 \| hfinkel \| 2015-07-14 15:53:11 -0700 (Tue, 14 Jul 2015) \| 4 lines [PowerPC] Support symbolic targets in patchpoints Follow-up r235483, with the corresponding support in PPC. We use a regular call for symbolic targets (because they're much cheaper than indirect calls). ------------------------------------------------------------------------ git-svn-id: https://llvm.org/svn/llvm-project/llvm/branches/release_37@242325 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-15 20:27:43 +00:00
Hal Finkel	a8eaf29f90	[PowerPC] Use the ABI indirect-call protocol for patchpoints We used to take the address specified as the direct target of the patchpoint and did no TOC-pointer handling. This, however, was not all that useful, because MCJIT tends to create a lot of modules, and they have their own TOC sections. Thus, to call from the generated code to other generated code, you really need to switch TOC pointers. Make this work as expected, and under ELFv1, treat the address as the function descriptor address so that the correct TOC pointer can be loaded. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242217 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 22:26:06 +00:00
Alex Lorenz	6e50c921d0	MIR Serialization: Serialize the machine basic block live in registers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242204 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 21:24:41 +00:00
Hal Finkel	13141f04d3	[PowerPC] Fix the PPCInstrInfo::getInstrLatency implementation PowerPC uses itineraries to describe processor pipelines (and dispatch-group restrictions for P7/P8 cores). Unfortunately, the target-independent implementation of TII.getInstrLatency calls ItinData->getStageLatency, and that looks for the largest cycle count in the pipeline for any given instruction. This, however, yields the wrong answer for the PPC itineraries, because we don't encode the full pipeline. Because the functional units are fully pipelined, we only model the initial stages (there are no relevant hazards in the later stages to model), and so the technique employed by getStageLatency does not really work. Instead, we should take the maximum output operand latency, and that's what PPCInstrInfo::getInstrLatency now does. This caused some test-case churn, including two unfortunate side effects. First, the new arrangement of copies we get from function parameters now sometimes blocks VSX FMA mutation (a FIXME has been added to the code and the test cases), and we have one significant test-suite regression: SingleSource/Benchmarks/BenchmarkGame/spectral-norm 56.4185% +/- 18.9398% In this benchmark we have a loop with a vectorized FP divide, and with the new scheduling both divides end up in the same dispatch group (which in this case seems to cause a problem, although why is not exactly clear). The grouping structure is hard to predict from the bottom of the loop, and there may not be much we can do to fix this. Very few other test-suite performance effects were really significant, but almost all weakly favor this change. However, in light of the issues highlighted above, I've left the old behavior available via a command-line flag. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242188 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 20:02:02 +00:00
Krzysztof Parzyszek	d496e176f0	[Hexagon] Generate instructions for operations on predicate registers Convert logical operations on general-purpose registers to the corresponding operations on predicate registers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242186 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 19:30:21 +00:00
Keno Fischer	890c16626f	[CodeGen] Force emission of personality directive if explicitly specified Summary: Before this change, personality directives were not emitted if there was no invoke left in the function (of course until recently this also meant that we couldn't know what the personality actually was). This patch forces personality directives to still be emitted, unless it is known to be a noop in the absence of invokes, or the user explicitly specified `nounwind` (and not `uwtable`) on the function. Reviewers: majnemer, rnk Subscribers: rnk, llvm-commits Differential Revision: http://reviews.llvm.org/D10884 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242185 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 19:22:51 +00:00
Matt Arsenault	ba38e6c2ae	AMDGPU: Avoid using 64-bit shift for i64 (shl x, 32) This can be done only with moves which theoretically will optimize better later. Although this transform increases the instruction count, it should be code size / cycle count neutral in the worst VALU case. It also seems to slightly improve a couple of testcases due to other DAG combines this exposes. This is probably slightly worse for the SALU case, so it might be better to handle this during moveToVALU, although then you lose some simplifications like the load width reducing in the simple testcase. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242177 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 18:20:33 +00:00
Matt Arsenault	3aa0d7cb53	AMDGPU/SI: Fix read2 merging into a super register. If the read2 produced was supposed to be writing into a super register, it would use the wrong subregister indices. Fix this by inserting copies, so we only ever write to a vreg_64. Run the register coalescer again to clean this up, although this isn't ideal and often does result in an extra move. Also remove the assert that offset1 > offset0. There isn't a real reason to not allow this other than a minor convenience in the compiler, and it doesn't seem worth the effort of avoiding it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242174 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 17:57:36 +00:00
Nemanja Ivanovic	582194d3b8	Add missing builtins to the PPC back end for ABI compliance (vol. 4) This patch corresponds to review: http://reviews.llvm.org/D11183 Back end portion of the fourth round of additions to altivec.h. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242167 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 17:25:20 +00:00
Tim Northover	18ec07dece	ARM: add at least one real test for r242123. The ones committed were orthogonal to the change and would have passed before that revision. What it did do was prevent an assertion failure when generating object files. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242166 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 17:23:55 +00:00
Matthias Braun	a36268215f	PrologEpilogInserter: Rewrite API to determine callee save registers. This changes TargetFrameLowering::processFunctionBeforeCalleeSavedScan(): - Rename the function to determineCalleeSaves() - Pass a bitset of callee saved registers by reference, thus avoiding the function-global PhysRegUsed bitset in MachineRegisterInfo. - Without PhysRegUsed the implementation is fine tuned to not save physical registers which are only read but never modified. Related to rdar://21539507 Differential Revision: http://reviews.llvm.org/D10909 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242165 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 17:17:13 +00:00
Krzysztof Parzyszek	14e60218b6	[Hexagon] Generate "extract" instructions more aggressively Generate extract instructions (via intrinsics) before the DAG combiner folds shifts into unrecognizable forms. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242163 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 17:07:24 +00:00
Tom Stellard	adb194b458	AMDGPU/SI: Add support for shrinking v_cndmask_b32_e32 instructions Reviewers: arsenm Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D11061 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242146 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 14:15:03 +00:00
Yaron Keren	6f1e023b46	Generate correct asm info for mingw and cygwin ARM targets. http://reviews.llvm.org/D11075 Patch by Martell Malone Reviewed by Reid Kleckner git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242123 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 05:51:05 +00:00
NAKAMURA Takumi	2c022eaa9b	Give an explicit triple to llvm/test/CodeGen/X86/pr13577.ll. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242111 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 03:07:06 +00:00
Matthias Braun	21910d8bd5	Revert "LegalizeDAG: Fix and improve FCOPYSIGN/FABS legalization" Accidental commit, needs review first. This reverts commit r242107. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242108 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 02:09:57 +00:00
Matthias Braun	2a46e4c020	LegalizeDAG: Fix and improve FCOPYSIGN/FABS legalization - Factor out code to query and modify the sign bit of a floatingpoint value as an integer. This also works if none of the targets integer types is big enough to hold all bits of the floatingpoint value. - Legalize FABS(x) as FCOPYSIGN(x, 0.0) if FCOPYSIGN is available, otherwise perform bit manipulation on the sign bit. The previous code used "x >u 0 ? x : -x" which is incorrect for x being -0.0! It also takes 34 instructions on ARM Cortex-M4. With this patch we only require 5: vldr d0, LCPI0_0 vmov r2, r3, d0 lsrs r2, r3, #31 bfi r1, r2, #31, #1 bx lr (This could be further improved if the compiler would recognize that r2, r3 is zero). - Only lower FCOPYSIGN(x, y) = sign(x) ? -FABS(x) : FABS(x) if FABS is available otherwise perform bit manipulation on the sign bit. - Perform the sign(x) test by masking out the sign bit and comparing with 0 rather than shifting the sign bit to the highest position and testing for "<s 0". For x86 copysignl (on 80bit values) this gets us: testl $32768, %eax rather than: shlq $48, %rax sets %al testb %al, %al git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242107 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 02:08:26 +00:00
Matthias Braun	dbe717878a	X86: Check output of x86 copysignl testcase. This makes the changes in an upcoming patch visible. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242106 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 02:08:23 +00:00
Alex Lorenz	dee03ee0f9	MIR Serialization: Serialize the variable sized stack objects. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242095 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-14 00:26:26 +00:00
Alex Lorenz	c249168837	MIR Serialization: Serialize the sub register indices. This commit serializes the sub register indices from the register machine operands. Reviewers: Duncan P. N. Exon Smith git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242084 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-13 23:24:34 +00:00
Bill Schmidt	045b2171c4	[PPC64LE] More improvements to VSX swap optimization This patch allows VSX swap optimization to succeed more frequently. Specifically, it is concerned with common code sequences that occur when copying a scalar floating-point value to a vector register. This patch currently handles cases where the floating-point value is already in a register, but does not yet handle loads (such as via an LXSDX scalar floating-point VSX load). That will be dealt with later. A typical case is when a scalar value comes in as a floating-point parameter. The value is copied into a virtual VSFRC register, and then a sequence of SUBREG_TO_REG and/or COPY operations will convert it to a full vector register of the class required by the context. If this vector register is then used as part of a lane-permuted computation, the original scalar value will be in the wrong lane. We can fix this by adding a swap operation following any widening SUBREG_TO_REG operation. Additional COPY operations may be needed around the swap operation in order to keep register assignment happy, but these are pro forma operations that will be removed by coalescing. If a scalar value is otherwise directly referenced in a computation (such as by one of the many XS* vector-scalar operations), we currently disable swap optimization. These operations are lane-sensitive by definition. A MentionsPartialVR flag is added for use in each swap table entry that mentions a scalar floating-point register without having special handling defined. A common idiom for PPC64LE is to convert a double-precision scalar to a vector by performing a splat operation. This ensures that the value can be referenced as V[0], as it would be for big endian, whereas just converting the scalar to a vector with a SUBREG_TO_REG operation leaves this value only in V[1]. 
A doubleword splat operation is one form of an XXPERMDI instruction, which takes one doubleword from a first operand and another doubleword from a second operand, with a two-bit selector operand indicating which doublewords are chosen. In the general case, an XXPERMDI can be permitted in a lane-swapped region provided that it is properly transformed to select the corresponding swapped values. This transformation is to reverse the order of the two input operands, and to reverse and complement the bits of the selector operand (derivation left as an exercise to the reader ;). A new test case that exercises the scalar-to-vector and generalized XXPERMDI transformations is added as CodeGen/PowerPC/swaps-le-5.ll. The patch also requires a change to CodeGen/PowerPC/swaps-le-3.ll to use CHECK-DAG instead of CHECK for two independent instructions that now appear in reverse order. There are two small unrelated changes that are added with this patch. First, the XXSLDWI instruction was incorrectly omitted from the list of lane-sensitive instructions; this is now fixed. Second, I observed that the same webs were being rejected over and over again for different reasons. Since it's sufficient to reject a web only once, I added a check for this to speed up the compilation time slightly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242081 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-13 22:58:19 +00:00
Reid Kleckner	da9c587dad	[WinEH] Emit the LSDA even if no lpads remain but outlining occurred The outlined funclets call intrinsics which reference labels from the LSDA. This situation can easily arise in small functions with a single cleanup at -O0, where Clang marks a definition as nounwind, and then WinEHPrepare "discovers" that the landingpad is dead by accident and deletes it. We now need to ask the LLVM IR Function for its personality directly, rather than going through MachineModuleInfo. Fixes PR23892. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242063 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-13 20:41:46 +00:00
Alex Lorenz	ad7556d177	MIR Serialization: Serialize the fixed stack objects. This commit serializes the fixed stack objects, including fixed spill slots. The fixed stack objects are serialized using a YAML sequence of YAML inline mappings. Each mapping has the object's ID, type, size, offset, and alignment. The objects that aren't spill slots also serialize the isImmutable and isAliased flags. The fixed stack objects are a part of the machine function's YAML mapping. Reviewers: Duncan P. N. Exon Smith git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242045 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-13 18:07:26 +00:00
Reid Kleckner	c6d1cc7e16	[WinEH] Strip the \01 character from the __CxxFrameHandler3 thunk name Add another C++ 32-bit EH table test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242044 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-13 17:55:14 +00:00
James Y Knight	af8cf90e2f	Fix handling of the 'n' asm constraint with invalid operands. It had accidentally accepted a symbol+offset value (and emitted incorrect code for it, keeping only the offset part) instead of properly reporting the constraint as invalid. Differential Revision: http://reviews.llvm.org/D11039 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242040 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-13 16:36:22 +00:00
Tom Stellard	f5be357d37	AMDGPU/SI: Select mad patterns to v_mac_f32 The two-address instruction pass will convert these back to v_mad_f32 if necessary. Differential Revision: http://reviews.llvm.org/D11060 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242038 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-13 15:47:57 +00:00
Logan Chien	af3e4a2f2f	ARM: Fix cttz expansion on vector types. The 64/128-bit vector types are legal if NEON instructions are available. However, there were no matching patterns for @llvm.cttz.*() intrinsics, which resulted in a fatal error. This commit fixes the problem by lowering cttz to: a. ctpop((x & -x) - 1) b. width - ctlz(x & -x) - 1 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242037 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-13 15:37:30 +00:00
Rafael Espindola	5572685621	Print the visibility of available_externally functions. We were already printing it for declarations, but not available_externally. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242027 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-13 13:55:18 +00:00
Elena Demikhovsky	a0a51734cd	AVX-512: Added all AVX-512 forms of Vector Convert for Float/Double/Int/Long types. This patch contains only the encoding. Intrinsics and DAG lowering will be in the next patch. I temporarily removed the old intrinsics test (just to split this patch). Half types are not covered here. Differential Revision: http://reviews.llvm.org/D11134 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242023 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-13 13:26:20 +00:00
Renato Golin	4173058d07	[ARM] Add support for nest attribute using r12 Register r12 ('ip') is used by GCC for this purpose and hence is used here. As discussed on the GCC mailing list, the register choice is an ABI issue and so choosing the same register as GCC means __builtin_call_with_static_chain is compatible. A similar patch has just gone in the AArch64 backend, so this is just the ARM counterpart, following the same discussion. Patch by Stephen Cross. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241996 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-12 18:16:40 +00:00
Simon Pilgrim	07c08a6a50	[X86][SSE] Tidied up vector extend/truncation tests. NFCI. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241995 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-12 17:40:49 +00:00
Simon Pilgrim	f9df477221	[X86][SSE] Vectorized v4i32 non-uniform shifts. While the v4i32 shl operation is already vectorized using a cvttps2dq/pmulld pattern, the lshr/ashr operations are still scalarized. This patch adds vectorization support for non-uniform v4i32 shift operations - it splats constant shift amounts to allow them to use the immediate sse shift instructions, or extracts/zero-extends non-constant shift amounts. The individual results are then blended together. Differential Revision: http://reviews.llvm.org/D11063 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241989 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-12 11:15:19 +00:00
Hal Finkel	866cf31c07	[PowerPC] Make use of the TargetRecip system r238842 added the TargetRecip system for controlling use of reciprocal estimates for sqrt and division using a set of parameters that can be set by the frontend. Clang now supports a sophisticated -mrecip option, and this will allow that option to effectively control the relevant code-generation functionality of the PPC backend. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241985 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-12 02:33:57 +00:00
Hal Finkel	d14325bee9	[PowerPC] Support the nest parameter attribute This adds support for the 'nest' attribute, which allows the static chain register to be set for functions calls under non-Darwin PPC/PPC64 targets. r11 is the chain register (which the PPC64 ELF ABI calls the "environment pointer"). For indirect calls under PPC64 ELFv1, this would normally be loaded from the function descriptor, but providing an explicit 'nest' parameter will override that process and use the value provided. This allows __builtin_call_with_static_chain to work as expected on PowerPC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241984 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-12 00:37:44 +00:00
Alex Lorenz	1cca87a981	MIR Serialization: Serialize the virtual register operands. Reviewers: Duncan P. N. Exon Smith Differential Revision: http://reviews.llvm.org/D11005 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241959 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-10 22:51:20 +00:00
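
Note on af3e4a2f2f (ARM cttz expansion): the two lowerings quoted in that commit message are plain bit identities. The following is a minimal Python sketch, not code from the LLVM tree; it assumes a fixed 32-bit lane, and the names WIDTH, MASK, cttz_reference, cttz_via_ctpop, ctlz and cttz_via_ctlz are hypothetical helpers introduced only for illustration. It compares both forms against a naive bit-counting loop.

```python
WIDTH = 32                 # assumed lane width, chosen for illustration
MASK = (1 << WIDTH) - 1    # keeps Python integers inside one lane


def cttz_reference(x):
    """Naive count of trailing zero bits; returns WIDTH for x == 0."""
    x &= MASK
    if x == 0:
        return WIDTH
    count = 0
    while x & 1 == 0:
        x >>= 1
        count += 1
    return count


def cttz_via_ctpop(x):
    """Form (a): ctpop((x & -x) - 1)."""
    x &= MASK
    lowest = x & (-x & MASK)          # isolate the lowest set bit
    return bin((lowest - 1) & MASK).count("1")


def ctlz(x):
    """Count of leading zero bits within WIDTH bits."""
    return WIDTH - (x & MASK).bit_length()


def cttz_via_ctlz(x):
    """Form (b): width - ctlz(x & -x) - 1; only meaningful for x != 0."""
    x &= MASK
    lowest = x & (-x & MASK)
    return WIDTH - ctlz(lowest) - 1


if __name__ == "__main__":
    for value in (1, 2, 8, 12, 0x80000000, 0xFFFFFFFF):
        print(value, cttz_reference(value),
              cttz_via_ctpop(value), cttz_via_ctlz(value))
```

In this sketch form (a) also yields the lane width for a zero input, while form (b) yields -1, so form (b) only suits the case where a zero input is undefined.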

14006 Commits
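
Note on 045b2171c4 (VSX swap optimization): the commit message leaves the derivation of the XXPERMDI rewrite "as an exercise to the reader". The sketch below is one way to check it by brute force. It is a simplified model, not LLVM code: a vector register is assumed to be a pair of doublewords, xxpermdi is assumed to pick doubleword a[high selector bit] followed by b[low selector bit], and the function names xxpermdi, swap and transform_selector are hypothetical.

```python
from itertools import product


def xxpermdi(a, b, dm):
    """Model: result = (a[high bit of dm], b[low bit of dm])."""
    return (a[(dm >> 1) & 1], b[dm & 1])


def swap(reg):
    """Exchange the two doublewords, as in a lane-swapped region."""
    return (reg[1], reg[0])


def transform_selector(dm):
    """Reverse the two selector bits and complement each of them."""
    high, low = (dm >> 1) & 1, dm & 1
    return ((1 - low) << 1) | (1 - high)


def check_all():
    """Check the rewrite for every selector value on symbolic inputs."""
    a, b = ("a0", "a1"), ("b0", "b1")
    for dm in range(4):
        expected = swap(xxpermdi(a, b, dm))
        # In the swapped region each input holds its doublewords exchanged,
        # and the rewrite also reverses the order of the two operands.
        actual = xxpermdi(swap(b), swap(a), transform_selector(dm))
        if expected != actual:
            return False
    return True


if __name__ == "__main__":
    print(check_all())
```

Under this model the swapped-region instruction produces exactly the swapped form of the original result for all four selector values, which is the property the commit message describes.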