llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-08-12 00:29:20 +00:00

Author	SHA1	Message	Date
Hal Finkel	ae06fa2542	Fix cpu on test CodeGen/PowerPC/ctrloop-fp64.ll We need ppc instead of generic to override native features on ppc machines. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182049 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 20:28:05 +00:00
Hal Finkel	c482454e3c	Create an new preheader in PPCCTRLoops to avoid counter register clobbers Some IR-level instructions (such as FP <-> i64 conversions) are not chained w.r.t. the mtctr intrinsic and yet may become function calls that clobber the counter register. At the selection-DAG level, these might be reordered with the mtctr intrinsic causing miscompiles. To avoid this situation, if an existing preheader has instructions that might use the counter register, create a new preheader for the mtctr intrinsic. This extra block will be remerged with the old preheader at the MI level, but will prevent unwanted reordering at the selection-DAG level. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182045 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 19:58:38 +00:00
Ulrich Weigand	347a5079e1	[PowerPC] Use true offset value in "memrix" machine operands This is the second part of the change to always return "true" offset values from getPreIndexedAddressParts, tackling the case of "memrix" type operands. This is about instructions like LD/STD that only have a 14-bit field to encode immediate offsets, which are implicitly extended by two zero bits by the machine, so that in effect we can access 16-bit offsets as long as they are a multiple of 4. The PowerPC back end currently handles such instructions by carrying the 14-bit value (as it will get encoded into the actual machine instructions) in the machine operand fields for such instructions. This means that those values are in fact not the true offset, but rather the offset divided by 4 (and then truncated to an unsigned 14-bit value). Like in the case fixed in r182012, this makes common code operations on such offset values not work as expected. Furthermore, there doesn't really appear to be any strong reason why we should encode machine operands this way. This patch therefore changes the encoding of "memrix" type machine operands to simply contain the "true" offset value as a signed immediate value, while enforcing the rules that it must fit in a 16-bit signed value and must also be a multiple of 4. This change must be made simultaneously in all places that access machine operands of this type. However, just about all those changes make the code simpler; in many cases we can now just share the same code for memri and memrix operands. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182032 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 17:58:02 +00:00
Hal Finkel	2a5e8c328e	PPC32 cannot form counter loops around i64 FP conversions On PPC32, i64 FP conversions are implemented using runtime calls (which clobber the counter register). These must be excluded. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182023 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 16:52:41 +00:00
Bill Schmidt	0d6423b476	Use new CHECK-DAG support to stabilize CodeGen/PowerPC/recipest.ll While testing some experimental code to add vector-scalar registers to PowerPC, I noticed that a couple of independent instructions were flipped by the scheduler. The new CHECK-DAG support is perfect for avoiding this problem. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182020 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 16:15:18 +00:00
Ulrich Weigand	f0ef882828	[PowerPC] Report true displacement value from getPreIndexedAddressParts DAGCombiner::CombineToPreIndexedLoadStore calls a target routine to decompose a memory address into a base/offset pair. It expects the offset (if constant) to be the true displacement value in order to perform optional additional optimizations; in particular, to convert other uses of the original pointer into uses of the new base pointer after pre-increment. The PowerPC implementation of getPreIndexedAddressParts, however, simply calls SelectAddressRegImm, which returns a TargetConstant. This value is appropriate for encoding into the instruction, but it is not always usable as true displacement value: - Its type is always MVT::i32, even on 64-bit, where addresses ought to be i64 ... this causes the optimization to simply always fail on 64-bit due to this line in DAGCombiner: // FIXME: In some cases, we can be smarter about this. if (Op1.getValueType() != Offset.getValueType()) { - Its value is truncated to an unsigned 16-bit value if negative. This causes the above opimization to generate wrong code. This patch fixes both problems by simply returning the true displacement value (in its original type). This doesn't affect any other user of the displacement. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182012 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 14:53:05 +00:00
Rafael Espindola	aba2d6d051	Extend test to check the .cfi instructions. I am about to refactor the calls to addFrameMove and some of the ppc ones were not being tested. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182009 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 14:30:09 +00:00
Rafael Espindola	0225d5a3af	Extend test for better coverage. Without this change nothing was covering this addFrameMove: // For 64-bit SVR4 when we have spilled CRs, the spill location // is SP+8, not a frame-relative slot. if (Subtarget.isSVR4ABI() && Subtarget.isPPC64() && (PPC::CR2 <= Reg && Reg <= PPC::CR4)) { MachineLocation CSDst(PPC::X1, 8); MachineLocation CSSrc(PPC::CR2); MMI.addFrameMove(Label, CSDst, CSSrc); continue; } git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181976 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 03:48:50 +00:00
Hal Finkel	b1fd3cd78f	Implement PPC counter loops as a late IR-level pass The old PPCCTRLoops pass, like the Hexagon pass version from which it was derived, could only handle some simple loops in canonical form. We cannot directly adapt the new Hexagon hardware loops pass, however, because the Hexagon pass contains a fundamental assumption that non-constant-trip-count loops will contain a guard, and this is not always true (the result being that incorrect negative counts can be generated). With this commit, we replace the pass with a late IR-level pass which makes use of SE to calculate the backedge-taken counts and safely generate the loop-count expressions (including any necessary max() parts). This IR level pass inserts custom intrinsics that are lowered into the desired decrement-and-branch instructions. The most fragile part of this new implementation is that interfering uses of the counter register must be detected on the IR level (and, on PPC, this also includes any indirect branches in addition to function calls). Also, to make all of this work, we need a variant of the mtctr instruction that is marked as having side effects. Without this, machine-code level CSE, DCE, etc. illegally transform the resulting code. Hopefully, this can be improved in the future. This new pass is smaller than the original (and much smaller than the new Hexagon hardware loops pass), and can handle many additional cases correctly. In addition, the preheader-creation code has been copied from LoopSimplify, and after we decide on where it belongs, this code will be refactored so that it can be explicitly shared (making this implementation even smaller). The new test-case files ctrloop-{le,lt,ne}.ll have been adapted from tests for the new Hexagon pass. There are a few classes of loops that this pass does not transform (noted by FIXMEs in the files), but these deficiencies can be addressed within the SE infrastructure (thus helping many other passes as well). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181927 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-15 21:37:41 +00:00
Bill Schmidt	ded53bf4dd	PPC32: Fix stack collision between FP and CR save areas. The changes to CR spill handling missed a case for 32-bit PowerPC. The code in PPCFrameLowering::processFunctionBeforeFrameFinalized() checks whether CR spill has occurred using a flag in the function info. This flag is only set by storeRegToStackSlot and loadRegFromStackSlot. spillCalleeSavedRegisters does not call storeRegToStackSlot, but instead produces MI directly. Thus we don't see the CR is spilled when assigning frame offsets, and the CR spill ends up colliding with some other location (generally the FP slot). This patch sets the flag in spillCalleeSavedRegisters for PPC32 so that the CR spill is properly detected and gets its own slot in the stack frame. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181800 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-14 16:08:32 +00:00
Bill Schmidt	240b9b6078	PPC64: Constant initializers with dynamic relocations go in .data.rel.ro. This fixes warning messages observed in the oggenc application test in projects/test-suite. Special handling is needed for the 64-bit PowerPC SVR4 ABI when a constant is initialized with a pointer to a function in a shared library. Because a function address is implemented as the address of a function descriptor, the use of copy relocations can lead to problems with initialization. GNU ld therefore replaces copy relocations with dynamic relocations to be resolved by the dynamic linker. This means the constant cannot reside in the read-only data section, but instead belongs in .data.rel.ro, which is designed for constants containing dynamic relocations. The implementation creates a class PPC64LinuxTargetObjectFile inheriting from TargetLoweringObjectFileELF, which behaves like its parent except to place constants of this sort into .data.rel.ro. The test case is reduced from the oggenc application. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181723 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-13 19:34:37 +00:00
Bill Schmidt	a7f2ce85e5	Fix handling of anonymous aggregate parameters for powerpc*-apple-darwin8. This fixes bug 15821 similarly to the powerpc64-linux fix for bug 14779. Patch by David Fang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181449 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-08 17:22:33 +00:00
Hal Finkel	b45eb9fd27	PPCInstrInfo::optimizeCompareInstr should not optimize FP compares The floating-point record forms on PPC don't set the condition register bits based on a comparison with zero (like the integer record forms do), but rather based on the exception status bits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181423 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-08 12:16:14 +00:00
Hal Finkel	db31bd31d6	LocalStackSlotAllocation improvements First, taking advantage of the fact that the virtual base registers are allocated in order of the local frame offsets, remove the quadratic register-searching behavior. Because of the ordering, we only need to check the last virtual base register created. Second, store the frame index in the FrameRef structure, and get the frame index and the local offset from this structure at the top of the loop iteration. This allows us to de-nest the loops in insertFrameReferenceRegisters (and I think makes the code cleaner). I also moved the needsFrameBaseReg check into the first loop over instructions so that we don't bother pushing FrameRefs for instructions that don't want a virtual base register anyway. Lastly, and this is the only functionality change, avoid the creation of single-use virtual base registers. These are currently not useful because, in general, they end up replacing what would be one r+r instruction with an add and a r+i instruction. Committing this removes the XFAIL in CodeGen/PowerPC/2007-09-07-LoadStoreIdxForms.ll Jim has okayed this off-list. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180799 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-30 20:04:37 +00:00
Manman Ren	2dc50d3067	TBAA: remove !tbaa from testing cases if not used. This will make it easier to turn on struct-path aware TBAA since the metadata format will change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180796 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-30 17:52:57 +00:00
Rafael Espindola	5b0ce3570c	Make all darwin ppc stubs local. This fixes pr15763. Patch by David Fang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180657 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-27 00:43:16 +00:00
Hal Finkel	87c1e42be7	Fix PPC optimizeCompareInstr swapped-sub argument handling When matching a compare with a subtract where the arguments of the compare are swapped w.r.t. the arguments of the subtract, we need to negate the predicates (or CR bit indices) of the users. This, however, is not the same as inverting the predicate (negating LT -> GT, but inverting LT -> GE, for example). The ARM backend seems to do this correctly, but when I adapted the code for the PPC backend, I introduced an error in this logic. Comparison optimization is now enabled again by default. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179899 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-19 22:08:38 +00:00
Hal Finkel	4029c3feed	Disable PPC comparison optimization by default This seems to cause a stage-2 LLVM compile failure (by crashing TableGen); do I'm disabling this for now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179807 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-18 22:54:25 +00:00
Hal Finkel	860c08cad5	Implement optimizeCompareInstr for PPC Many PPC instructions have a so-called 'record form' which stores to a specific condition register the result of comparing the result of the instruction with zero (always as a signed comparison). For integer operations on PPC64, this is always a 64-bit comparison. This implementation is derived from the implementation in the ARM backend; there are some differences because PPC condition registers are allocatable virtual registers (although the record forms always use a specific one), and we look for a matching subtraction instruction after the compare (but before the first use) in addition to before it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179802 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-18 22:15:08 +00:00
Hal Finkel	fb6fe0aea2	Fix PPC64 CR spill location for callee-saved registers This fixes an ABI bug for non-Darwin PPC64. For the callee-saved condition registers, the spill location is specified relative to the stack pointer (SP + 8). However, this is not relative to the SP after the new stack frame is established, but instead relative to the caller's stack pointer (it is stored into the linkage area of the parent's stack frame). So, like with the link register, we don't directly spill the CRs with other callee-saved registers, but just mark them to be spilled during prologue generation. In practice, this reverts r179457 for PPC64 (but leaves it in place for PPC32). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179500 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-15 02:07:05 +00:00
Hal Finkel	b99c995825	Spill and restore PPC CR registers using the FP when we have one For functions that need to spill CRs, and have dynamic stack allocations, the value of the SP during the restore is not what it was during the save, and so we need to use the FP in these cases (as for all of the other spills and restores, but the CR restore has a special code path because its reserved slot, like the link register, is specified directly relative to the adjusted SP). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179457 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-13 08:09:20 +00:00
Nico Rieck	f89da7210b	Replace coff-/elf-dump with llvm-readobj git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179361 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-12 04:06:46 +00:00
Benjamin Kramer	8f6f47ec75	FileCheckize a bunch of tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179276 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-11 12:32:23 +00:00
Hal Finkel	9af7014019	Manually remove successors in if conversion when CopyAndPredicateBlock is used In the simple and triangle if-conversion cases, when CopyAndPredicateBlock is used because the to-be-predicated block has other predecessors, we need to explicitly remove the old copied block from the successors list. Normally if conversion relies on TII->AnalyzeBranch combined with BB->CorrectExtraCFGEdges to cleanup the successors list, but if the predicated block contained an un-analyzable branch (such as a now-predicated return), then this will fail. These extra successors were causing a problem on PPC because it was causing later passes (such as PPCEarlyReturm) to leave dead return-only basic blocks in the code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179227 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-10 22:05:25 +00:00
Hal Finkel	7eb0d8148e	Allow PPC B and BLR to be if-converted into some predicated forms This enables us to form predicated branches (which are the same conditional branches we had before) and also a larger set of predicated returns (including instructions like bdnzlr which is a conditional return and loop-counter decrement all in one). At the moment, if conversion does not capture all possible opportunities. A simple example is provided in early-ret2.ll, where if conversion forms one predicated return, and then the PPCEarlyReturn pass picks up the other one. So, at least for now, we'll keep both mechanisms. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179134 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-09 22:58:37 +00:00
Hal Finkel	f6f8198d85	Use virtual base registers on PPC On PowerPC, non-vector loads and stores have r+i forms; however, in functions with large stack frames these were not being used to access slots far from the stack pointer because such slots were out of range for the signed 16-bit immediate offset field. This increases register pressure because we need a separate register for each offset (when the r+r form is used). By enabling virtual base registers, we can deal with large stack frames without unduly increasing register pressure. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179105 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-09 17:27:09 +00:00
Hal Finkel	307b8535ee	Convert test PowerPC/2007-09-07-LoadStoreIdxForms to FileCheck git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179104 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-09 17:26:55 +00:00
Hal Finkel	5ee67e8e76	Generate PPC early conditional returns PowerPC has a conditional branch to the link register (return) instruction: BCLR. This should be used any time when we'd otherwise have a conditional branch to a return. This adds a small pass, PPCEarlyReturn, which runs just prior to the branch selection pass (and, importantly, after block placement) to generate these conditional returns when possible. It will also eliminate unconditional branches to returns (these happen rarely; most of the time these have already been tail duplicated by the time PPCEarlyReturn is invoked). This is a nice optimization for small functions that do not maintain a stack frame. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179026 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-08 16:24:03 +00:00
Hal Finkel	59889f7f49	Cleanup and improve PPC fsel generation First, we should not cheat: fsel-based lowering of select_cc is a finite-math-only optimization (the ISA manual, section F.3 of v2.06, makes this clear, as does a note in our own README). This also adds fsel-based lowering of EQ and NE condition codes. As it turned out, fsel generation was covered by a grand total of zero regression test cases. I've added some test cases to cover the existing behavior (which is now finite-math only), as well as the new EQ cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179000 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-07 22:11:09 +00:00
Hal Finkel	839b909653	Implement PPCInstrInfo::FoldImmediate There are certain PPC instructions into which we can fold a zero immediate operand. We can detect such cases by looking at the register class required by the using operand (so long as it is not otherwise constrained). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178961 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-06 19:30:30 +00:00
Hal Finkel	ff56d1a201	Enable early if conversion on PPC On cores for which we know the misprediction penalty, and we have the isel instruction, we can profitably perform early if conversion. This enables us to replace some small branch sequences with selects and avoid the potential stalls from mispredicting the branches. Enabling this feature required implementing canInsertSelect and insertSelect in PPCInstrInfo; isel code in PPCISelLowering was refactored to use these functions as well. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178926 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-05 23:29:01 +00:00
Hal Finkel	7530a9f7d1	PPC: Improve code generation for mixed-precision reciprocal sqrt The DAGCombine logic that recognized a/sqrt(b) and transformed it into a multiplication by the reciprocal sqrt did not handle cases where the sqrt and the division were separated by an fpext or fptrunc. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178801 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-04 22:44:12 +00:00
Bill Schmidt	cd7a1558ed	Fix PR15632: No support for ppcf128 floating-point remainder on PowerPC. For this we need to use a libcall. Previously LLVM didn't implement libcall support for frem, so I've added it in the usual straightforward manner. A test case from the bug report is included. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178639 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-03 13:05:44 +00:00
Hal Finkel	827307b95f	Use PPC reciprocal estimates with Newton iteration in fast-math mode When unsafe FP math operations are enabled, we can use the fre[s] and frsqrte[s] instructions, which generate reciprocal (sqrt) estimates, together with some Newton iteration, in order to quickly generate floating-point division and sqrt results. All of these instructions are separately optional, and so each has its own feature flag (except for the Altivec instructions, which are covered under the existing Altivec flag). Doing this is not only faster than using the IEEE-compliant fdiv/fsqrt instructions, but allows these computations to be pipelined with other computations in order to hide their overall latency. I've also added a couple of missing fnmsub patterns which turned out to be missing (but are necessary for good code generation of the Newton iterations). Altivec needs a similar fix, but that will probably be more complicated because fneg is expanded for Altivec's v4f32. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178617 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-03 04:01:11 +00:00
Bill Schmidt	debf7d345a	Fix PR15630: Replace faulty stdcx. with stwcx. When doing a partword atomic operation, a lwarx was being paired with a stdcx. instead of a stwcx. when compiling for a 64-bit target. The target has nothing to do with it in this case; we always need a stwcx. Thanks to Kai Nacke for reporting the problem. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178559 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-02 18:37:08 +00:00
Hal Finkel	a1646ceb9a	Fix a bad assert in PPCTargetLowering git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178489 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-01 18:42:58 +00:00
Hal Finkel	6c81b118ca	Add triple to test/CodeGen/PowerPC/stfiwx-2 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178486 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-01 18:18:44 +00:00
Hal Finkel	4647919784	Add more PPC floating-point conversion instructions The P7 and A2 have additional floating-point conversion instructions which allow a direct two-instruction sequence (plus load/store) to convert from all combinations (signed/unsigned i32/i64) <--> (float/double) (on previous cores, only some combinations were directly available). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178480 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-01 17:52:07 +00:00
Hal Finkel	dc8efbae14	Fix PowerPC/cttz.ll to specify a cpu (and use FileCheck) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178472 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-01 16:31:56 +00:00
Hal Finkel	1fce88313e	Add the PPC popcntw instruction The popcntw instruction is available whenever the popcntd instruction is available, and performs a separate popcnt on the lower and upper 32-bits. Ignoring the high-order count, this can be used for the 32-bit input case (saving on the explicit zero extension otherwise required to use popcntd). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178470 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-01 15:58:15 +00:00
Hal Finkel	8049ab15e4	Add the PPC lfiwax instruction This instruction is available on modern PPC64 CPUs, and is now used to improve the SINT_TO_FP lowering (by eliminating the need for the separate sign extension instruction and decreasing the amount of needed stack space). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178446 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-31 10:12:51 +00:00
Hal Finkel	9ad0f4907b	Cleanup PPC(64) i32 -> float/double conversion The existing SINT_TO_FP code for i32 -> float/double conversion was disabled because it relied on broken EXTSW_32/STD_32 instruction definitions. The original intent had been to enable these 64-bit instructions to be used on CPUs that support them even in 32-bit mode. Unfortunately, this form of lying to the infrastructure was buggy (as explained in the FIXME comment) and had therefore been disabled. This re-enables this functionality, using regular DAG nodes, but only when compiling in 64-bit mode. The old STD_32/EXTSW_32 definitions (which were dead) are removed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178438 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-31 01:58:02 +00:00
Hal Finkel	0882fd6c4f	Implement FRINT lowering on PPC using frin Like nearbyint, rint can be implemented on PPC using the frin instruction. The complication comes from the fact that rint needs to set the FE_INEXACT flag when the result does not equal the input value (and frin does not do that). As a result, we use a custom inserter which, after the rounding, compares the rounded value with the original, and if they differ, explicitly sets the XX bit in the FPSCR register (which corresponds to FE_INEXACT). Once LLVM has better modeling of the floating-point environment we should be able to (often) eliminate this extra complexity. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178362 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-29 19:41:55 +00:00
Hal Finkel	f5d5c43460	Add PPC FP rounding instructions fri[mnpz] These instructions are available on the P5x (and later) and on the A2. They implement the standard floating-point rounding operations (floor, trunc, etc.). One caveat: frin (round to nearest) does not implement "ties to even", and so is only enabled in fast-math mode. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178337 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-29 08:57:48 +00:00
Hal Finkel	af0d148b20	Specify CPUs on the PPC bswap-load-store test Otherwise, the CHECK-NOT's might trigger depending on the host's CPU. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178287 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-28 20:35:18 +00:00
Hal Finkel	2544f221c5	Only enable 64-bit bswap DAG combines for PPC64 Compiling in 32-bit mode on a P7 would assert after 64-bit DAG combines were added for bswap with load/store. This is because these combines are really only valid in 64-bit mode, regardless of the CPU (and this was not being checked). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178286 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-28 20:23:46 +00:00
Hal Finkel	efdd4673d6	Add the PPC64 ldbrx/stdbrx instructions These are 64-bit load/store with byte-swap, and available on the P7 and the A2. Like the similar instructions for 16- and 32-bit words, these are matched in the target DAG-combine phase against load/store-bswap pairs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178276 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-28 19:25:55 +00:00
Hal Finkel	c53ab4d77f	Add the PPC64 popcntd instruction PPC ISA 2.06 (P7, A2, etc.) has a popcntd instruction. Add this instruction and tell TTI about it so that popcount-loop recognition will know about it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178233 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-28 13:29:47 +00:00
Hal Finkel	d957f957ee	Cleanup PPC CR-spill kill flags and 32- vs. 64-bit instructions There were a few places where kill flags were not being set correctly, and where 32-bit instruction variants were being used with 64-bit registers. After r178180, this code was being triggered causing llc to assert. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178220 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-28 03:38:16 +00:00
Hal Finkel	32e12df253	Print PPC ZERO as 0 (not r0) even on Darwin It seems that the Darwin PPC assembler requires r0 to be written as 0 when it means 0 (at least in lwarx/stwcx.). Fixes PR15605. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178142 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-27 13:20:52 +00:00

1 2 3 4 5 ...

767 Commits