llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2025-02-24 12:29:33 +00:00

Author	SHA1	Message	Date
Kevin Qin	6739735812	Revert "r214832 - MachineCombiner Pass for selecting faster instruction" It broke compilation of most benchmarks and internal tests, as clang crashed with a segmentation fault or assertion failure. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214845 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-05 05:43:47 +00:00
Juergen Ributzka	7e9c0bc511	[FastISel][AArch64] Don't perform sign-/zero-extension for function arguments that have already been sign-/zero-extended. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214844 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-05 05:43:44 +00:00
Gerolf Hoflehner	c2328d552c	MachineCombiner Pass for selecting faster instruction sequence on AArch64 Re-commit of r214669 without changes to test cases LLVM::CodeGen/AArch64/arm64-neon-mul-div.ll and LLVM:: CodeGen/AArch64/dp-3source.ll This resolves the reported compfails of the original commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214832 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-05 01:16:13 +00:00
Juergen Ributzka	2c68cde701	[FastISel][AArch64] Fix shift lowering for i8 and i16 value types. This fix changes the parameters #r and #s that are passed to the UBFM/SBFM instruction to get the zero/sign-extension for free. The original problem was that the shift left would use the 32-bit shift even for i8/i16 value types, which could leave the upper bits set with "garbage" values. The arithmetic shift right on the other side would use the wrong MSB as sign-bit to determine what bits to shift into the value. This fixes <rdar://problem/17907720>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214788 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-04 21:49:51 +00:00
Chad Rosier	82c93451f3	[AArch64] Extend the number of scalar instructions supported in the AdvSIMD scalar integer instruction pass. This is a patch I had lying around from a few months ago. The pass is currently disabled by default, so nothing too interesting. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214779 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-04 21:20:25 +00:00
Kevin Qin	534100b31e	Revert "r214669 - MachineCombiner Pass for selecting faster instruction" This commit broke "make check" for several hours, so get it reverted. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214697 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-04 05:10:33 +00:00
Gerolf Hoflehner	48e1bd7287	MachineCombiner Pass for selecting faster instruction sequence - AArch64 target support This patch turns off madd/msub generation in the DAGCombiner and generates them in the MachineCombiner instead. It replaces the original code sequence with the combined sequence when it is beneficial to do so. When there is no machine model support it always generates the madd/msub instruction. This is also true when the objective is to optimize for code size: when the combined sequence is shorter, it is always chosen and does not get evaluated. When there is a machine model the combined instruction sequence is evaluated for critical path and resource length using machine trace metrics and the original code sequence is replaced when it is determined to be faster. rdar://16319955 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214669 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-03 22:03:40 +00:00
James Molloy	14d596a382	Update test to use a more modern AArch64 triple, as requested by Renato. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214637 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-02 17:15:11 +00:00
James Molloy	e411c38de9	[AArch64] Teach DAGCombiner that converting two consecutive loads into a vector load is not a good transform when paired loads are available. The combiner was creating Q-register loads and stores, which then had to be spilled because there are no callee-save Q registers! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214634 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-02 14:51:24 +00:00
Juergen Ributzka	8d3bf10dd3	[FastISel][AArch64] Fold offset into the memory operation. Fold simple offsets into the memory operation: add x0, x0, #8 ldr x0, [x0] --> ldr x0, [x0, #8] Fixes <rdar://problem/17887945>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214545 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-01 19:40:16 +00:00
Juergen Ributzka	3d253a3f80	[FastISel][AArch64] Add branch weights. Add branch weights to branch instructions, so that the following passes can optimize based on it (i.e. basic block ordering). Fixes <rdar://problem/17887137>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214537 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-01 18:39:24 +00:00
Chad Rosier	b4cded4926	[AArch64] Fix test from r214518 in an attempt to appease buildbots. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214521 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-01 15:30:41 +00:00
Chad Rosier	4175e2f597	[AArch64] Generate tbz/tbnz when comparing against zero. The tbz/tbnz checks the sign bit to convert op w1, w1, w10 cmp w1, #0 b.lt .LBB0_0 to op w1, w1, w10 tbnz w1, #31, .LBB0_0 Differential Revision: http://reviews.llvm.org/D4440 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214518 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-01 14:48:56 +00:00
Juergen Ributzka	74ac16386b	[FastISel][AArch64] Fix the immediate versions of the {s|u}{add|sub}.with.overflow intrinsics. ADDS and SUBS cannot encode negative immediates or immediates larger than 12 bits. This fix checks if the immediate version can be used under these constraints and if we can convert ADDS to SUBS or vice versa to support negative immediates. Also update the test cases to test the immediate versions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214470 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-01 01:25:55 +00:00
Juergen Ributzka	95ec2761f3	[FastISel][AArch64] Add basic bitcast support for conversion between float and int. Fixes <rdar://problem/17867078>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214389 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-31 06:25:37 +00:00
Juergen Ributzka	a60ed423a6	[FastISel][AArch64] Add sqrt intrinsic support. Fixes <rdar://problem/17867067>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214388 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-31 06:25:33 +00:00
Juergen Ributzka	e482ebc147	[FastISel][AArch64] Update and enable patchpoint and stackmap intrinsic tests for FastISel. This commit updates the existing SelectionDAG tests for the stackmap and patchpoint intrinsics and enables FastISel testing. It also splits up the tests into separate files, due to different codegen between SelectionDAG and FastISel. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214382 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-31 04:10:43 +00:00
Juergen Ributzka	e3a75015d7	[FastISel][AArch64] Add MachO large code model support for function calls. Currently the large code model for MachO uses the GOT to make function calls. Emit the required adrp and ldr instructions to load the address from the GOT. Related to <rdar://problem/17733076>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214381 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-31 04:10:40 +00:00
Juergen Ributzka	cb99212bc1	[FastISel][AArch64] Add select folding support for the XALU intrinsics. This improves the code generation for the XALU intrinsics when the condition is feeding a select instruction. This also updates and enables the XALU unit tests for FastISel. This fixes <rdar://problem/17831117>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214350 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-30 22:04:37 +00:00
Juergen Ributzka	b0dba10fa6	[FastISel][AArch64] Add support for shift-immediate. Currently the shift-immediate versions are not supported by tblgen and hopefully this can be later removed, once the required support has been added to tblgen. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214345 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-30 22:04:22 +00:00
Jiangning Liu	a3b8caf8a7	Implement AArch64 TTI interface isAsCheapAsAMove. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214159 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-29 02:09:26 +00:00
Tim Northover	54a7f7f9e0	AArch64: fix conversion of 'J' inline asm constraints. 'J' represents a negative number suitable for an add/sub alias instruction, but while preparing it to become an int64_t we were mangling the sign extension. So "i32 -1" became 0xffffffffLL, for example. Should fix one half of PR20456. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214052 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-27 07:10:29 +00:00
Akira Hatanaka	0651a556fe	[stack protector] Fix a potential security bug in stack protector where the address of the stack guard was being spilled to the stack. Previously the address of the stack guard would get spilled to the stack if it was impossible to keep it in a register. This patch introduces a new target independent node and pseudo instruction which gets expanded post-RA to a sequence of instructions that load the stack guard value. Register allocator can now just remat the value when it can't keep it in a register. <rdar://problem/12475629> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213967 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-25 19:31:34 +00:00
Juergen Ributzka	06640d93e0	[FastISel][AArch64] Add support for frameaddress intrinsic. This commit implements the frameaddress intrinsic for the AArch64 architecture in FastISel. There were two test cases that pretty much tested the same, so I combined them to a single test case. Fixes <rdar://problem/17811834> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213959 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-25 17:47:14 +00:00
Chandler Carruth	d24d326705	[SDAG] Introduce a combined set to the DAG combiner which tracks nodes which have successfully round-tripped through the combine phase, and use this to ensure all operands to DAG nodes are visited by the combiner, even if they are only added during the combine phase. This is critical to have the combiner reach nodes that are introduced during combining. Previously these would sometimes be visited and sometimes not be visited based on whether they happened to end up on the worklist or not. Now we always run them through the combiner. This fixes quite a few bad codegen test cases lurking in the suite while also being more principled. Among these, the TLS codegeneration is particularly exciting for programs that have this in the critical path like TSan-instrumented binaries (although I think they engineer to use a different TLS that is faster anyways). I've tried to check for compile-time regressions here by running llc over a merged (but not LTO-ed) clang bitcode file and observed at most a 3% slowdown in llc. Given that this is essentially a worst case (none of opt or clang are running at this phase) I think this is tolerable. The actual LTO case should be even less costly, and the cost in normal compilation should be negligible. With this combining logic, it is possible to re-legalize as we combine which is necessary to implement PSHUFB formation on x86 as a post-legalize DAG combine (my ultimate goal). Differential Revision: http://reviews.llvm.org/D4638 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213898 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-24 22:15:28 +00:00
Kevin Qin	2daff76c05	[AArch64] Fix a bug generating incorrect instruction when building small vector. This bug is introduced by r211144. The element of operand may be smaller than the element of result, but previous commit can only handle the contrary condition. This commit is to handle this scenario and generate optimized codes like ZIP1. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213830 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-24 02:05:42 +00:00
Jiangning Liu	1bc34d71b7	[AArch64] Disable some optimization cases for type conversion from sint to fp, because those optimization cases are micro-architecture dependent and only make sense for Cyclone. A new predicate Cyclone is introduced in .td file. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213827 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-24 01:29:59 +00:00
Jim Grosbach	8c0cf3e110	Use an explicit triple in testcase. Make the test work better on non-darwin hosts. Hopefully. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213801 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-23 20:46:32 +00:00
Jim Grosbach	adb6a3649e	[X86,AArch64] Extend vcmp w/ unary op combine to work w/ more constants. The transform to constant fold unary operations with an AND across a vector comparison applies when the constant is not a splat of a scalar as well. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213800 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-23 20:41:43 +00:00
Jim Grosbach	4070037e3c	X86: restrict combine to when type sizes are safe. The folding of unary operations through a vector compare and mask operation is only safe if the unary operation result is of the same size as its input. For example, it's not safe for [su]itofp from v4i32 to v4f64. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213799 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-23 20:41:38 +00:00
Juergen Ributzka	4fa6ecc26f	[FastISel][AArch64] Fix return type in FastLowerCall. I used the wrong method to obtain the return type inside FinishCall. This fix simply uses the return type from FastLowerCall, which we already determined to be a valid type. Reduced test case from Chad. Thanks. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213788 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-23 20:03:13 +00:00
Chad Rosier	67c325e9f0	[AArch64] Lower sdiv x, pow2 using add + select + shift. The target-independent DAGcombiner will generate: asr w1, X, #31 w1 = splat sign bit. add X, X, w1, lsr #28 X = X + 0 or pow2-1 asr w0, X, asr #4 w0 = X/pow2 However, the add + shifts is expensive, so generate: add w0, X, 15 w0 = X + pow2-1 cmp X, wzr X - 0 csel X, w0, X, lt X = (X < 0) ? X + pow2-1 : X; asr w0, X, asr 4 w0 = X/pow2 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213758 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-23 14:57:52 +00:00
Tim Northover	8b6257629a	AArch64: remove "arm64_be" support in favour of "aarch64_be". There really is no arm64_be: it was a useful fiction to test big-endian support while both backends existed in parallel, but now the only platform that uses the name (iOS) doesn't have a big-endian variant, let alone one called "arm64_be". git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213748 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-23 12:58:11 +00:00
Juergen Ributzka	7edf396977	[FastIsel][AArch64] Add support for the FastLowerCall and FastLowerIntrinsicCall target-hooks. This commit modifies the existing call lowering functions to be used as the FastLowerCall and FastLowerIntrinsicCall target-hooks instead. This enables patchpoint intrinsic lowering for AArch64. This fixes <rdar://problem/17733076> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213704 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-22 23:14:58 +00:00
Juergen Ributzka	be8c68d72d	[AArch64] Use CHECK-LABEL in ARM64 ABI unit tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213703 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-22 23:14:54 +00:00
Tim Northover	f8d927f22b	CodeGen: emit IR-level f16 conversion intrinsics as fptrunc/fpext This makes the first stage DAG for @llvm.convert.to.fp16 an fptrunc, and correspondingly @llvm.convert.from.fp16 an fpext. The legalisation path is now uniform, regardless of the input IR: fptrunc -> FP_TO_FP16 (if f16 illegal) -> libcall fpext -> FP16_TO_FP (if f16 illegal) -> libcall Each target should be able to select the version that best matches its operations and not be required to duplicate patterns for both fptrunc and FP_TO_FP16 (for example). As a result we can remove some redundant AArch64 patterns. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213507 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-21 09:13:56 +00:00
Tim Northover	e72ff8829e	AArch64: implement efficient f16 bitcasts Because i16 is illegal, there's no native DAG method to represent a bitcast to or from an f16 type. This meant LLVM was inserting a stack store/load pair which is really not ideal. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213378 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-18 13:07:05 +00:00
Tim Northover	1a8bcdb72e	AArch64: support f16 extend/trunc operations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213375 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-18 13:01:31 +00:00
Tim Northover	0afed03229	CodeGen: soften f16 type by default instead of marking legal. Actual support for softening f16 operations is still limited, and can be added when it's needed. But Soften is much closer to being a useful thing to try than keeping it Legal when no registers can actually hold such values. Longer term, we probably want something between Soften and Promote semantics for most targets, it'll be more efficient to promote the 4 basic operations to f32 than libcall them. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213372 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-18 12:41:46 +00:00
Jim Grosbach	f4e104f5eb	AArch64: Constant fold converting vector setcc results to float. Since the result of a SETCC for AArch64 is 0 or -1 in each lane, we can move unary operations, in this case [su]int_to_fp through the mask operation and constant fold the operation away. Generally speaking: UNARYOP(AND(VECTOR_CMP(x,y), constant)) --> AND(VECTOR_CMP(x,y), constant2) where constant2 is UNARYOP(constant). This implements the transform where UNARYOP is [su]int_to_fp. For example, consider the simple function: define <4 x float> @foo(<4 x float> %val, <4 x float> %test) nounwind { %cmp = fcmp oeq <4 x float> %val, %test %ext = zext <4 x i1> %cmp to <4 x i32> %result = sitofp <4 x i32> %ext to <4 x float> ret <4 x float> %result } Before this change, the code is generated as: fcmeq.4s v0, v0, v1 movi.4s v1, #0x1 // Integer splat value. and.16b v0, v0, v1 // Mask lanes based on the comparison. scvtf.4s v0, v0 // Convert each lane to f32. ret After, the code is improved to: fcmeq.4s v0, v0, v1 fmov.4s v1, #1.00000000 // f32 splat value. and.16b v0, v0, v1 // Mask lanes based on the comparison. ret The scvtf.4s has been constant folded away and the floating point 1.0f vector lanes are materialized directly via fmov.4s. Rather than do the folding manually in the target code, teach getNode() in the generic SelectionDAG to handle folding constant operands of vector [su]int_to_fp nodes. It is reasonable (as noted in a FIXME) to do additional constant folding there as well, but I don't have test cases for those operations, so leaving them for another time when it becomes appropriate. rdar://17693791 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213341 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-18 00:40:52 +00:00
Tim Northover	3e61ccdded	CodeGen: extend f16 conversions to permit types > float. This makes the two intrinsics @llvm.convert.from.f16 and @llvm.convert.to.f16 accept types other than simple "float". This is only strictly needed for the truncate operation, since otherwise double rounding occurs and there's no way to represent the strict IEEE conversion. However, for symmetry we allow larger types in the extend too. During legalization, we can expand an "fp16_to_double" operation into two extends for convenience, but abort when the truncate isn't legal. A new libcall is probably needed here. Even after this commit, various target tweaks are needed to actually use the extended intrinsics. I've put these into separate commits for clarity, so there are no actual tests of f64 conversion here. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213248 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 10:51:23 +00:00
Yi Kong	f33a30cdd0	Port memory barriers intrinsics to AArch64 Memory barrier __builtin_arm_[dmb, dsb, isb] intrinsics are required to implement their corresponding ACLE and MSVC intrinsics. This patch ports ARM dmb, dsb, isb intrinsic to AArch64. Differential Revision: http://reviews.llvm.org/D4520 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213247 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 10:50:20 +00:00
Tim Northover	fbb631183a	AArch64: fall back to generic code for out of range extract/insert. rdar://problem/17624784 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213059 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-15 10:00:26 +00:00
Tim Northover	26012cec89	AArch64: remove unnecessary pseudo-instruction. Sufficiently twisted use of TableGen lets us write patterns directly for f16 (as an i16 promoted to i32) -> f32 conversion. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212933 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-14 11:16:02 +00:00
Saleem Abdulrasool	01c06d7954	AArch64: add support for llvm.aarch64.hint intrinsic This adds a llvm.aarch64.hint intrinsic to mirror the llvm.arm.hint in order to support the various hint intrinsic functions in the ACLE. Add an optional pattern field that permits the subclass to specify the pattern that matches the selection. The intrinsic pattern is set as mayLoad, mayStore, so overload the value for the definition of the hint instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212883 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-12 21:20:49 +00:00
Oliver Stannard	cb047f2a74	ARM: Allow __fp16 as a function arg or return type for AArch64 ACLE 2.0 allows __fp16 to be used as a function argument or return type. This enables this for AArch64. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212812 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-11 13:33:46 +00:00
Tim Northover	6b0ac2aa02	AArch64: correctly fast-isel i8 & i16 multiplies We were asking for a register for type i8 or i16 which caused an assert. rdar://problem/17620015 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212718 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-10 14:18:46 +00:00
Hao Liu	a3c15c19b8	[AArch64]Fix an assertion failure in DAG Combiner about concating 2 build_vector. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212677 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-10 03:41:50 +00:00
Jim Grosbach	a3edd6a038	AArch64: Better codegen for storing to __fp16. Storing will generally be immediately preceded by rounding from an f32 or f64, so make sure to match those patterns directly to convert into the FPR16 register class directly rather than going through the integer GPRs. This also eliminates an extra step in the convert-from-f64 path which was first converting to f32 and then to f16 from there. rdar://17594379 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212638 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-09 18:55:52 +00:00
Jim Grosbach	05bb7c5045	AArch64: Better codegen for loading from __fp16. Loading will generally extend to an f32 or an f64, so make sure to match those patterns directly to load into the FPR16 register class directly rather than going through the integer GPRs. This also eliminates an extra step in the convert-to-f64 path which was first converting to f32 and then to f64 from there. rdar://17594379 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212573 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-08 23:28:48 +00:00
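
Note on r213758 ("[AArch64] Lower sdiv x, pow2 using add + select + shift") in the log above: the commit message gives the replacement sequence only as flattened assembly. The arithmetic it describes can be sketched in Python as follows. This is a minimal illustration, not LLVM code; the helper names `sdiv_pow2` and `trunc_div` are hypothetical, and 32-bit signed inputs are assumed.

```python
def sdiv_pow2(x: int, k: int) -> int:
    """Signed division of x by 2**k, rounding toward zero.

    Hypothetical helper that mirrors the add + select + shift sequence
    described in the r213758 commit message; it is not taken from LLVM.
    """
    assert 0 <= k < 31                    # assumed: divisor fits in a positive i32
    assert -(1 << 31) <= x < (1 << 31)    # assumed: x is a 32-bit signed value
    biased = x + ((1 << k) - 1)           # add:  w0 = X + pow2-1
    chosen = biased if x < 0 else x       # cmp + csel: use the biased value only when X < 0
    return chosen >> k                    # asr: Python's >> on ints is an arithmetic shift


def trunc_div(x: int, d: int) -> int:
    """Reference division that truncates toward zero, as sdiv does."""
    q = abs(x) // abs(d)
    return q if (x < 0) == (d < 0) else -q
```

The bias of pow2-1 is needed because a plain arithmetic shift rounds toward negative infinity, whereas sdiv rounds toward zero; adding the bias to negative inputs only corrects that difference.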

393 Commits