llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-08-28 22:29:56 +00:00

Author	SHA1	Message	Date
Bill Schmidt	cd7a1558ed	Fix PR15632: No support for ppcf128 floating-point remainder on PowerPC. For this we need to use a libcall. Previously LLVM didn't implement libcall support for frem, so I've added it in the usual straightforward manner. A test case from the bug report is included. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178639 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-03 13:05:44 +00:00
Hal Finkel	827307b95f	Use PPC reciprocal estimates with Newton iteration in fast-math mode When unsafe FP math operations are enabled, we can use the fre[s] and frsqrte[s] instructions, which generate reciprocal (sqrt) estimates, together with some Newton iteration, in order to quickly generate floating-point division and sqrt results. All of these instructions are separately optional, and so each has its own feature flag (except for the Altivec instructions, which are covered under the existing Altivec flag). Doing this is not only faster than using the IEEE-compliant fdiv/fsqrt instructions, but allows these computations to be pipelined with other computations in order to hide their overall latency. I've also added a couple of missing fnmsub patterns which turned out to be missing (but are necessary for good code generation of the Newton iterations). Altivec needs a similar fix, but that will probably be more complicated because fneg is expanded for Altivec's v4f32. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178617 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-03 04:01:11 +00:00
Bill Schmidt	debf7d345a	Fix PR15630: Replace faulty stdcx. with stwcx. When doing a partword atomic operation, a lwarx was being paired with a stdcx. instead of a stwcx. when compiling for a 64-bit target. The target has nothing to do with it in this case; we always need a stwcx. Thanks to Kai Nacke for reporting the problem. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178559 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-02 18:37:08 +00:00
Hal Finkel	2a401959b9	Fix typo in PPCISelLowering Thanks to Bill Schmidt for finding this in review of r178480. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178521 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-02 03:29:51 +00:00
Hal Finkel	a1646ceb9a	Fix a bad assert in PPCTargetLowering git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178489 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-01 18:42:58 +00:00
Hal Finkel	4647919784	Add more PPC floating-point conversion instructions The P7 and A2 have additional floating-point conversion instructions which allow a direct two-instruction sequence (plus load/store) to convert from all combinations (signed/unsigned i32/i64) <--> (float/double) (on previous cores, only some combinations were directly available). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178480 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-01 17:52:07 +00:00
Hal Finkel	1fce88313e	Add the PPC popcntw instruction The popcntw instruction is available whenever the popcntd instruction is available, and performs a separate popcnt on the lower and upper 32-bits. Ignoring the high-order count, this can be used for the 32-bit input case (saving on the explicit zero extension otherwise required to use popcntd). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178470 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-01 15:58:15 +00:00
Hal Finkel	f170cc9b2e	Treat PPCISD::STFIWX like the memory opcode that it is PPCISD::STFIWX is really a memory opcode, and so it should come after FIRST_TARGET_MEMORY_OPCODE, and we should use DAG.getMemIntrinsicNode to create nodes using it. No functionality change intended (although there could be optimization benefits from preserving the MMO information). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178468 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-01 15:37:53 +00:00
Hal Finkel	8049ab15e4	Add the PPC lfiwax instruction This instruction is available on modern PPC64 CPUs, and is now used to improve the SINT_TO_FP lowering (by eliminating the need for the separate sign extension instruction and decreasing the amount of needed stack space). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178446 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-31 10:12:51 +00:00
Hal Finkel	9ad0f4907b	Cleanup PPC(64) i32 -> float/double conversion The existing SINT_TO_FP code for i32 -> float/double conversion was disabled because it relied on broken EXTSW_32/STD_32 instruction definitions. The original intent had been to enable these 64-bit instructions to be used on CPUs that support them even in 32-bit mode. Unfortunately, this form of lying to the infrastructure was buggy (as explained in the FIXME comment) and had therefore been disabled. This re-enables this functionality, using regular DAG nodes, but only when compiling in 64-bit mode. The old STD_32/EXTSW_32 definitions (which were dead) are removed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178438 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-31 01:58:02 +00:00
Hal Finkel	0882fd6c4f	Implement FRINT lowering on PPC using frin Like nearbyint, rint can be implemented on PPC using the frin instruction. The complication comes from the fact that rint needs to set the FE_INEXACT flag when the result does not equal the input value (and frin does not do that). As a result, we use a custom inserter which, after the rounding, compares the rounded value with the original, and if they differ, explicitly sets the XX bit in the FPSCR register (which corresponds to FE_INEXACT). Once LLVM has better modeling of the floating-point environment we should be able to (often) eliminate this extra complexity. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178362 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-29 19:41:55 +00:00
Benjamin Kramer	74a4533a42	Remove the old CodePlacementOpt pass. It was superseded by MachineBlockPlacement and disabled by default since LLVM 3.1. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178349 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-29 17:14:24 +00:00
Hal Finkel	f5d5c43460	Add PPC FP rounding instructions fri[mnpz] These instructions are available on the P5x (and later) and on the A2. They implement the standard floating-point rounding operations (floor, trunc, etc.). One caveat: frin (round to nearest) does not implement "ties to even", and so is only enabled in fast-math mode. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178337 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-29 08:57:48 +00:00
Hal Finkel	2544f221c5	Only enable 64-bit bswap DAG combines for PPC64 Compiling in 32-bit mode on a P7 would assert after 64-bit DAG combines were added for bswap with load/store. This is because these combines are really only valid in 64-bit mode, regardless of the CPU (and this was not being checked). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178286 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-28 20:23:46 +00:00
Hal Finkel	b52980be07	Fix bad indentation in r178276 Thanks to Bill Schmidt for pointing this out! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178280 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-28 19:43:12 +00:00
Hal Finkel	efdd4673d6	Add the PPC64 ldbrx/stdbrx instructions These are 64-bit load/store with byte-swap, and available on the P7 and the A2. Like the similar instructions for 16- and 32-bit words, these are matched in the target DAG-combine phase against load/store-bswap pairs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178276 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-28 19:25:55 +00:00
Hal Finkel	c53ab4d77f	Add the PPC64 popcntd instruction PPC ISA 2.06 (P7, A2, etc.) has a popcntd instruction. Add this instruction and tell TTI about it so that popcount-loop recognition will know about it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178233 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-28 13:29:47 +00:00
Hal Finkel	e915047fed	Fix typo (common to both X86 and PPC) Thanks to Bill Schmidt for pointing this out during code review! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178170 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-27 19:10:42 +00:00
Ulrich Weigand	7d35d3f432	PowerPC: Simplify FADD in round-to-zero mode. As part of the the sequence generated to implement long double -> int conversions, we need to perform an FADD in round-to-zero mode. This is problematical since the FPSCR is not at all modeled at the SelectionDAG level, and thus there is a risk of getting floating point instructions generated out of sequence with the instructions to modify FPSCR. The current code handles this by somewhat "special" patterns that in part have dummy operands, and/or duplicate existing instructions, making them awkward to handle in the asm parser. This commit changes this by leaving the "FADD in round-to-zero mode" as an atomic operation on the SelectionDAG level, and only split it up into real instructions at the MI level (via custom inserter). Since at this level the FPSCR is modeled (via the "RM" hard register), much of the "special" stuff can just go away, and the resulting patterns can be used by the asm parser. No significant change in generated code expected. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178006 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-26 10:56:22 +00:00
Ulrich Weigand	a01c7dbaab	PowerPC: Use CCBITRC operand for ISEL patterns. This commit changes the ISEL patterns to use a CCBITRC operand instead of a "pred" operand. This matches the actual instruction text more directly, and simplifies use of ISEL with the asm parser. In addition, this change allows some simplification of handling the "pred" operand, as this is now only used by BCC. No change in generated code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178003 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-26 10:54:54 +00:00
Ulrich Weigand	86765fbe17	Remove ABI-duplicated call instruction patterns. We currently have a duplicated set of call instruction patterns depending on the ABI to be followed (Darwin vs. Linux). This is a bit odd; while the different ABIs will result in different instruction sequences, the actual instructions themselves ought to be independent of the ABI. And in fact it turns out that the only nontrivial difference between the two sets of patterns is that in the PPC64 Linux ABI, the instruction used for indirect calls is marked to take X11 as extra input register (which is indeed used only with that ABI to hold an incoming environment pointer for nested functions). However, this does not need to be hard-coded at the .td pattern level; instead, the C++ code expanding calls can simply add that use, just like it adds uses for argument registers anyway. No change in generated code expected. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177735 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-22 15:24:13 +00:00
Ulrich Weigand	881a7154b9	Fix swapped BasePtr and Offset in pre-inc memory addresses. PPCTargetLowering::getPreIndexedAddressParts currently provides the base part of a memory address in the offset result, and the offset part in the base result. That swap is then undone again when an MI instruction is generated (in PPCDAGToDAGISel::Select for loads, and using .md Pat patterns for stores). This patch reverts this double swap, to make common code and back-end be in sync as to which part of the address is base and which is offset. To avoid performance regressions in certain cases, target code now checks whether the choice of base register would be rejected for pre-inc accesses by common code, and attempts to swap base and offset again in such cases. (Overall, this means that now pre-ice accesses are generated more frequently than before.) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177733 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-22 14:58:48 +00:00
Hal Finkel	7697370adf	Remove the G8RC_NOX0_and_GPRC_NOR0 PPC register class As Jakob pointed out in his review of r177423, having a shared ZERO register between the 32- and 64-bit register classes causes this odd G8RC_NOX0_and_GPRC_NOR0 class to be created. As recommended, this adds a ZERO8 register which differentiates the 32- and 64-bit zeros. No functionality change intended. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177683 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-21 23:45:03 +00:00
Hal Finkel	7ee74a663a	Implement builtin_{setjmp/longjmp} on PPC This implements SJLJ lowering on PPC, making the Clang functions __builtin_{setjmp/longjmp} functional on PPC platforms. The implementation strategy is similar to that on X86, with the exception that a branch-and-link variant is used to get the right jump address. Credit goes to Bill Schmidt for suggesting the use of the unconditional bcl form (instead of the regular bl instruction) to limit return-address-cache pollution. Benchmarking the speed at -O3 of: static jmp_buf env_sigill; void foo() { __builtin_longjmp(env_sigill,1); } main() { ... for (int i = 0; i < c; ++i) { if (__builtin_setjmp(env_sigill)) { goto done; } else { foo(); } done:; } ... } vs. the same code using the libc setjmp/longjmp functions on a P7 shows that this builtin implementation is ~4x faster with Altivec enabled and ~7.25x faster with Altivec disabled. This comparison is somewhat unfair because the libc version must also save/restore the VSX registers which we don't yet support. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177666 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-21 21:37:52 +00:00
Hal Finkel	e9cc0a09ae	Correct PPC FRAMEADDR lowering using a pseudo-register The old code used to lower FRAMEADDR tried to replicate the logic in the real frame-lowering code that determines whether or not the frame pointer (r31) will be used. When it seemed as through the frame pointer would not be used, the stack pointer (r1) was used instead. Unfortunately, because the stack size is not yet known, this does not work. Instead, this change introduces new always-reserved pseudo-registers (FP and FP8) that are replaced during prologue insertion with the real frame-pointer register (either r1 or r31). It is important that this intrinsic always return a valid frame address because it is used by Clang to store the frame address as part of code generation for __builtin_setjmp. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177653 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-21 19:03:19 +00:00
Hal Finkel	a548afc98f	Prepare to make r0 an allocatable register on PPC Currently the PPC r0 register is unconditionally reserved. There are two reasons for this: 1. r0 is treated specially (as the constant 0) by certain instructions, and so cannot be used with those instructions as a regular register. 2. r0 is used as a temporary register in the CR-register spilling process (where, under some circumstances, we require two GPRs). This change addresses the first reason by introducing a restricted register class (without r0) for use by those instructions that treat r0 specially. These register classes have a new pseudo-register, ZERO, which represents the r0-as-0 use. This has the side benefit of making the existing target code simpler (and easier to understand), and will make it clear to the register allocator that uses of r0 as 0 don't conflict will real uses of the r0 register. Once the CR spilling code is improved, we'll be able to allocate r0. Adding these extra register classes, for some reason unclear to me, causes requests to the target to copy 32-bit registers to 64-bit registers. The resulting code seems correct (and causes no test-suite failures), and the new test case covers this new kind of asymmetric copy. As r0 is still reserved, no functionality change intended. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177423 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-19 18:51:05 +00:00
Hal Finkel	08a215c286	Fix PPC unaligned 64-bit loads and stores PPC64 supports unaligned loads and stores of 64-bit values, but in order to use the r+i forms, the offset must be a multiple of 4. Unfortunately, this cannot always be determined by examining the immediate itself because it might be available only via a TOC entry. In order to get around this issue, we additionally predicate the selection of the r+i form on the alignment of the load or store (forcing it to be at least 4 in order to select the r+i form). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177338 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-18 23:00:58 +00:00
Hal Finkel	2d37f7b979	Enable unaligned memory access on PPC for scalar types Unaligned access is supported on PPC for non-vector types, and is generally more efficient than manually expanding the loads and stores. A few of the existing test cases were using expanded unaligned loads and stores to test other features (like load/store with update), and for these test cases, unaligned access remains disabled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@177160 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-15 15:27:13 +00:00
Benjamin Kramer	3853f74aba	ArrayRefize some code. No functionality change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@176648 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-07 20:33:29 +00:00
Eli Bendersky	700ed80d3d	Move the eliminateCallFramePseudoInstr method from TargetRegisterInfo to TargetFrameLowering, where it belongs. Incidentally, this allows us to delete some duplicated (and slightly different!) code in TRI. There are potentially other layering problems that can be cleaned up as a result, or in a similar manner. The refactoring was OK'd by Anton Korobeynikov on llvmdev. Note: this touches the target interfaces, so out-of-tree targets may be affected. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175788 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-21 20:05:00 +00:00
Jim Grosbach	3450f800aa	Update TargetLowering ivars for name policy. http://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly ivars should be camel-case and start with an upper-case letter. A few in TargetLowering were starting with a lower-case letter. No functional change intended. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175667 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-20 21:13:59 +00:00
Bill Schmidt	abc402886e	Additional fixes for bug 15155. This handles the cases where the 6-bit splat element is odd, converting to a three-instruction sequence to add or subtract two splats. With this fix, the XFAIL in test/CodeGen/PowerPC/vec_constants.ll is removed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175663 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-20 20:41:42 +00:00
Bill Schmidt	49deebb5eb	Fix bug 14779 for passing anonymous aggregates [patch by Kai Nacke]. The PPC backend doesn't handle these correctly. This patch uses logic similar to that in the X86 and ARM backends to track these arguments properly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175635 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-20 17:31:41 +00:00
Bill Schmidt	b34c79e4bb	Fix PR15155: lost vadd/vsplat optimization. During lowering of a BUILD_VECTOR, we look for opportunities to use a vector splat. When the splatted value fits in 5 signed bits, a single splat does the job. When it doesn't fit in 5 bits but does fit in 6, and is an even value, we can splat on half the value and add the result to itself. This last optimization hasn't been working recently because of improved constant folding. To circumvent this, create a pseudo VADD_SPLAT that can be expanded during instruction selection. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175632 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-20 15:50:31 +00:00
Bill Schmidt	212af6af02	PPC calling convention cleanup. Most of PPCCallingConv.td is used only by the 32-bit SVR4 ABI. Rename things to clarify this. Also delete some code that's been commented out for a long time. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174526 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-06 17:33:58 +00:00
Jakob Stoklund Olesen	6ab5061a2c	Move MRI liveouts to PowerPC return instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174409 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-05 18:12:00 +00:00
Benjamin Kramer	0d3731478e	Disable a couple more vector splat optimizations on PPC. I didn't see those because the test case used "not grep". FileCheck the test and XFAIL it, preserving the old optimization, so this can be fixed eventually. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174330 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-04 15:52:32 +00:00
Benjamin Kramer	4969310052	SelectionDAG: Teach FoldConstantArithmetic how to deal with vectors. This required disabling a PowerPC optimization that did the following: input: x = BUILD_VECTOR <i32 16, i32 16, i32 16, i32 16> lowered to: tmp = BUILD_VECTOR <i32 8, i32 8, i32 8, i32 8> x = ADD tmp, tmp The add now gets folded immediately and we're back at the BUILD_VECTOR we started from. I don't see a way to fix this currently so I left it disabled for now. Fix some trivially foldable X86 tests too. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174325 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-04 15:19:18 +00:00
Evan Cheng	8688a58c53	Teach SDISel to combine fsin / fcos into a fsincos node if the following conditions are met: 1. They share the same operand and are in the same BB. 2. Both outputs are used. 3. The target has a native instruction that maps to ISD::FSINCOS node or the target provides a sincos library call. Implemented the generic optimization in sdisel and enabled it for Mac OSX. Also added an additional optimization for x86_64 Mac OSX by using an alternative entry point __sincos_stret which returns the two results in xmm0 / xmm1. rdar://13087969 PR13204 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173755 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-29 02:32:37 +00:00
Chandler Carruth	0b8c9a80f2	Move all of the header files which are involved in modelling the LLVM IR into their new header subdirectory: include/llvm/IR. This matches the directory structure of lib, and begins to correct a long standing point of file layout clutter in LLVM. There are still more header files to move here, but I wanted to handle them in separate commits to make tracking what files make sense at each layer easier. The only really questionable files here are the target intrinsic tablegen files. But that's a battle I'd rather not fight today. I've updated both CMake and Makefile build systems (I think, and my tests think, but I may have missed something). I've also re-sorted the includes throughout the project. I'll be committing updates to Clang, DragonEgg, and Polly momentarily. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171366 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-02 11:36:10 +00:00
Bill Wendling	831737d329	Remove the Function::getFnAttributes method in favor of using the AttributeSet directly. This is in preparation for removing the use of the 'Attribute' class as a collection of attributes. That will shift to the AttributeSet class instead. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171253 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-30 10:32:01 +00:00
Hal Finkel	cd9ea51986	Expand PPC64 atomic load and store Use of store or load with the atomic specifier on 64-bit types would cause instruction-selection failures. As with the 32-bit case, these can use the default expansion in terms of cmp-and-swap. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171072 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-25 17:22:53 +00:00
Benjamin Kramer	91223a41ef	PowerPC: Expand VSELECT nodes. There's probably a better expansion for those nodes than the default for altivec, but this is better than crashing. VSELECTs occur in loop vectorizer output. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170551 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-19 15:49:14 +00:00
Bill Wendling	034b94b170	Rename the 'Attributes' class to 'Attribute'. It's going to represent a single attribute in the future. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170502 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-19 07:18:57 +00:00
Bill Schmidt	b453e16855	This patch improves the 64-bit PowerPC InitialExec TLS support by providing for a wider range of GOT entries that can hold thread-relative offsets. This matches the behavior of GCC, which was not documented in the PPC64 TLS ABI. The ABI will be updated with the new code sequence. Former sequence: ld 9,x@got@tprel(2) add 9,9,x@tls New sequence: addis 9,2,x@got@tprel@ha ld 9,x@got@tprel@l(9) add 9,9,x@tls Note that a linker optimization exists to transform the new sequence into the shorter sequence when appropriate, by replacing the addis with a nop and modifying the base register and relocation type of the ld. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170209 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-14 17:02:38 +00:00
Bill Schmidt	1e18b86192	This is another cleanup patch for 64-bit PowerPC TLS processing. I had some hackery in place that hid my poor use of TblGen, which I've now sorted out and cleaned up. No change in observable behavior, so no new test cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170149 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-13 20:57:10 +00:00
Bill Schmidt	dfebc4cc4c	This is just a clean-up patch that simplifies the initial-exec TLS logic by avoiding use of machine operand flags. No change in observable behavior, so no new test cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170141 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-13 18:45:54 +00:00
Bill Schmidt	349c2787cf	This patch implements local-dynamic TLS model support for the 64-bit PowerPC target. This is the last of the four models, so we now have full TLS support. This is mostly a straightforward extension of the general dynamic model. I had to use an additional Chain operand to tie ADDIS_DTPREL_HA to the register copy following ADDI_TLSLD_L; otherwise everything above the ADDIS_DTPREL_HA appeared dead and was removed. As before, there are new test cases to test the assembly generation, and the relocations output during integrated assembly. The expected code gen sequence can be read in test/CodeGen/PowerPC/tls-ld.ll. There are a couple of things I think can be done more efficiently in the overall TLS code, so there will likely be a clean-up patch forthcoming; but for now I want to be sure the functionality is in place. Bill git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170003 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-12 19:29:35 +00:00
Evan Cheng	946a3a9f22	Sorry about the churn. One more change to getOptimalMemOpType() hook. Did I mention the inline memcpy / memset expansion code is a mess? This patch split the ZeroOrLdSrc argument into two: IsMemset and ZeroMemset. The first indicates whether it is expanding a memset or a memcpy / memmove. The later is whether the memset is a memset of zero. It's totally possible (likely even) that targets may want to do different things for memcpy and memset of zero. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169959 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-12 02:34:41 +00:00
Evan Cheng	7d34267df6	- Rename isLegalMemOpType to isSafeMemOpType. "Legal" is a very overloade term. Also added more comments to explain why it is generally ok to return true. - Rename getOptimalMemOpType argument IsZeroVal to ZeroOrLdSrc. It's meant to be true for loaded source (memcpy) or zero constants (memset). The poor name choice is probably some kind of legacy issue. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169954 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-12 01:32:07 +00:00

1 2 3 4 5 ...

733 Commits