llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-10-02 02:55:35 +00:00

Author	SHA1	Message	Date
Ahmed Bougacha	40453da779	[X86] 8bit divrem: Improve codegen for AH register extraction. For 8-bit divrems where the remainder is used, we used to generate: divb %sil shrw $8, %ax movzbl %al, %eax That was to avoid an H-reg access, which is problematic mainly because it isn't possible in REX-prefixed instructions. This patch optimizes that to: divb %sil movzbl %ah, %eax To do that, we explicitly extend AH, and extract the L-subreg in the resulting register. The extension is done using the NOREX variants of MOVZX. To support signed operations, MOVSX_NOREX is also added. Further, this introduces a new SDNode type, [us]divrem_ext_hreg, which is then lowered to a sequence containing a single zext (rather than 2). Differential Revision: http://reviews.llvm.org/D6064 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221176 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-03 20:26:35 +00:00
Tom Stellard	fbd383c93c	Reapply: R600: Make sure to inline all internal functions Function calls aren't supported yet. This was reverted due to build breakages, which should be fixed now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221173 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-03 19:49:05 +00:00
Duncan P. N. Exon Smith	5e84760dde	IR: MDNode => Value: Instruction::getAllMetadataOtherThanDebugLoc() Change `Instruction::getAllMetadataOtherThanDebugLoc()` from a vector of `MDNode` to one of `Value`. Part of PR21433. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221167 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-03 18:13:57 +00:00
Charlie Turner	c3606b6b2e	Remove the cortex-a9-mp CPU. This CPU definition is redundant. The Cortex-A9 is defined as supporting multiprocessing extensions. Remove its definition and update appropriate tests. LLVM defines both a cortex-a9 CPU and a cortex-a9-mp CPU. The only difference between the two CPU definitions in ARM.td is that cortex-a9-mp contains the feature FeatureMP for multiprocessing extensions. This is redundant since the Cortex-A9 is defined as having multiprocessing extensions in the TRMs. armcc also defines the Cortex-A9 as having multiprocessing extensions by default. Change-Id: Ifcadaa6c322be0a33d9d2a39cfdd7da1d75981a7 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221166 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-03 17:38:00 +00:00
Oliver Stannard	e13ea1ddda	[AArch64] Fix miscompile of comparison with 0xffffffffffffffff Some literals in the AArch64 backend had 15 'f's rather than 16, causing comparisons with a constant 0xffffffffffffffff to be miscompiled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221157 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-03 15:28:40 +00:00
Sid Manning	0fc1662219	Handle ctor/init_array initialization. Hexagon was not calling InitializeELF and could not select between ctors and init_array. Phabricator revision: http://reviews.llvm.org/D6061 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221156 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-03 14:56:05 +00:00
Daniel Sanders	8c4cd2f07d	[mips] Remove unused prototype and variable. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221146 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-03 10:14:57 +00:00
Matt Arsenault	b6c9c729dd	R600: Don't unnecessarily repeat the register class git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221119 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-02 23:46:59 +00:00
Matt Arsenault	37b154c175	R600/SI: Use REG_SEQUENCE instead of INSERT_SUBREGs git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221118 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-02 23:46:54 +00:00
Matt Arsenault	2220408e1a	Support REG_SEQUENCE in tablegen. The problem is mostly that variadic output instructions aren't handled, so it is rejected for having an inconsistent number of operands, and then the right number of operands isn't emitted. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221117 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-02 23:46:51 +00:00
Daniel Sanders	eaa221a23e	Re-commit r221056 and others with fix, "[mips] Move F128 argument handling into MipsCCState as we did for returns. NFC." sret arguments can never originate from an f128 argument so we detect sret arguments and push false into OriginalArgWasF128. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221102 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-02 16:09:29 +00:00
NAKAMURA Takumi	1808f3e89a	Revert r221056 and others, "[mips] Move F128 argument handling into MipsCCState as we did for returns. NFC." r221056 "[mips] Move F128 argument handling into MipsCCState as we did for returns. NFC." r221058 "[mips] Fix unused variable warning introduced in r221056" r221059 "[mips] Move all ByVal handling into CCState and tablegen-erated code. NFC." r221061 "Renamed CCState members that appear to misspell 'Processed' as 'Proceed'. NFC." It caused undefined behavior in LLVM :: CodeGen/Mips/return-vector.ll. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221081 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-02 04:43:54 +00:00
Daniel Sanders	cfe761c9e6	Renamed CCState members that appear to misspell 'Processed' as 'Proceed'. NFC. Reviewers: rnk Reviewed By: rnk Subscribers: rnk, llvm-commits Differential Revision: http://reviews.llvm.org/D5978 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221061 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-01 19:32:23 +00:00
Daniel Sanders	ea8769cbe8	[mips] Move all ByVal handling into CCState and tablegen-erated code. NFC. Summary: CCState already contains a byval implementation that is very similar to the Mips custom code. This patch merges the custom code into the existing common code and tablegen-erated code. Reviewers: vmedic Reviewed By: vmedic Subscribers: rnk, llvm-commits Differential Revision: http://reviews.llvm.org/D5977 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221059 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-01 19:17:10 +00:00
Daniel Sanders	73d60e69f4	[mips] Fix unused variable warning introduced in r221056 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221058 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-01 18:53:01 +00:00
Daniel Sanders	d1bb2d1f92	[mips] Remove ByValArgInfo::Address in favour of CCValAssign::getMemLocOffset(). NFC. Summary: ByValArgInfo is practically the same as CCState::ByValInfo now. Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5976 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221057 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-01 18:44:56 +00:00
Daniel Sanders	61d2d3d245	[mips] Move F128 argument handling into MipsCCState as we did for returns. NFC. Summary: There are a couple more changes to make before analyzeFormalArguments can be merged into the standard AnalyzeFormalArguments. I've had to temporarily poke a couple holes in MipsCCState's encapsulation to save having to make all the required changes for this merge all at once. These will be removed shortly. We must merge our ByVal argument handling with the implementation in CCState. This will be done over the next three patches, then the fourth will merge analyzeFormalArguments with AnalyzeFormalArguments. Depends on D5967 Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5969 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221056 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-01 18:38:03 +00:00
Daniel Sanders	20dc677021	[mips] Remove MipsCC::CCInfo. NFC. Summary: It's now passed in as an argument to functions that need it. Eventually this argument will be replaced by the 'this' pointer for a MipsCCState object. Depends on D5966 Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5967 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221054 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-01 18:13:52 +00:00
Daniel Sanders	f814c6418d	[mips] Removed MipsCC::fixedArgFn(). NFC Summary: There is one remaining trace of it in MipsCC::analyzeCallOperands() where Mips16 might override the calling convention. This will be moved into tablegen-erated code later. Depends on D5965 Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D5966 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221053 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-01 17:44:51 +00:00
Daniel Sanders	5e4b155521	[tablegen] Add CustomCallingConv and use it to tablegen-erate the outermost parts of the Mips O32 implementation Summary: CustomCallingConv is simply a CallingConv that tablegen should not generate the implementation for. It allows regular CallingConv's to delegate to these custom functions. This is (currently) necessary for Mips and we cannot use CCCustom without having to adapt to the different API that CCCustom uses. This brings us a bit closer to being able to remove MipsCC::analyzeCallOperands and MipsCC::analyzeFormalArguments in favour of the common implementation. No functional change to the targets. Depends on D3341 Reviewers: vmedic Reviewed By: vmedic Subscribers: vmedic, llvm-commits Differential Revision: http://reviews.llvm.org/D5965 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221052 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-01 17:38:22 +00:00
Rafael Espindola	5793838fc8	Remove redundant calls to isMaterializable. This removes calls to isMaterializable in the following cases: * It was redundant with a call to isDeclaration now that isDeclaration returns the correct answer for materializable functions. * It was followed by a call to Materialize. Just call Materialize and check EC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221050 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-01 16:46:18 +00:00
Daniel Sanders	e2add4346d	Revert r221048 - Test commit It seems I can't commit unless $DBUS_SESSION_BUS_ADDRESS is set correctly and it is not set for ssh sessions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221049 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-01 16:08:03 +00:00
Daniel Sanders	211ed6d001	Test commit Added some whitespace to debug some authentication issues I'm having. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221048 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-01 16:00:40 +00:00
Adrian Prantl	bdec4aee7b	Revert "Temporarily revert r220777 to sort out build bot breakage." This reverts commit r221028. Later commits depend on this and reverting just this one causes even more bots to fail. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221041 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-01 03:19:45 +00:00
NAKAMURA Takumi	5ea912ac8d	Revert r220779, "[AVX512] Removed special case for cmp instructions in getVectorMaskingNode. Now cmp intrinsics lower as other intrinsics through VSELECT, and then VSELECT transforms to AND in PerformSELECTCombine." Since r221028 (reverting r220777), this caused failures. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221040 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-01 01:36:14 +00:00
Adrian Prantl	a4e1564971	Temporarily revert r220777 to sort out build bot breakage. "[x86] Simplify vector selection if condition value type matches vselect value type and true value is all ones or false value is all zeros." git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221028 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-01 00:26:59 +00:00
Duncan P. N. Exon Smith	3a84a6377c	IR: MDNode => Value: Instruction::getMetadata() Change `Instruction::getMetadata()` to return `Value` as part of PR21433. Update most callers to use `Instruction::getMDNode()`, which wraps the result in a `cast_or_null<MDNode>`. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221024 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-01 00:10:31 +00:00
Reid Kleckner	8d2e511d3b	Revert "R600: Add missing file to CMakeLists.txt" This reverts commit r220998. It should've been reverted with the other change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221021 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-31 23:39:10 +00:00
Reid Kleckner	a5607fb841	Revert "R600: Make sure to inline all internal functions" This reverts commit r220996. It introduced layering violations causing link errors in many configurations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221020 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-31 23:35:26 +00:00
Reid Kleckner	e1a4787d5d	Work around bugs in MSVC "14" CTP 3's conversion logic It appears to ignore or find ambiguous MachineInstrBuilder's conversion operators that allow conversion to MachineInstr* and MachineBasicBlock::bundle_iterator. As a workaround, add an explicit way to get the MachineInstr. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221017 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-31 23:19:46 +00:00
Tom Stellard	03fb370534	R600: Add IPO to the list of required libraries git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221004 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-31 21:52:08 +00:00
Tom Stellard	7dd15e65fb	R600: Add missing file to CMakeLists.txt git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220998 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-31 20:56:36 +00:00
Tom Stellard	b5c86504a0	R600: Don't promote allocas when one of the users is a ptrtoint instruction We need to figure out how to track ptrtoint values all the way until result is converted back to a pointer in order to correctly rewrite the pointer type. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220997 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-31 20:52:04 +00:00
Tom Stellard	5d6cee5e65	R600: Make sure to inline all internal functions Function calls aren't supported yet. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220996 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-31 20:52:02 +00:00
Bill Schmidt	2d32816a45	[PowerPC] Initial VSX intrinsic support, with min/max for vector double Now that we have initial support for VSX, we can begin adding intrinsics for programmer access to VSX instructions. This patch adds basic support for VSX intrinsics in general, and tests it by implementing intrinsics for minimum and maximum for the vector double data type. The LLVM portion of this is quite straightforward. There is a companion patch for Clang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220988 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-31 19:19:07 +00:00
Chad Rosier	b9ce924135	[AArch64] Check Dest Register Liveness in CondOpt pass. Our internal test reveals such case should not be transformed: cmp x17, #3 b.lt .LBB10_15 ... subs x12, x12, #1 b.gt .LBB10_1 where x12 is a liveout, becomes: cmp x17, #2 b.le .LBB10_15 ... subs x12, x12, #2 b.ge .LBB10_1 Unable to provide test case as it's difficult to reproduce on community branch. http://reviews.llvm.org/D6048 Patch by Zhaoshi Zheng <zhaoshiz@codeaurora.org>! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220987 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-31 19:02:38 +00:00
Quentin Colombet	9b6ca9304c	[CodeGenPrepare] Move extractelement close to store if they can be combined. This patch adds an optimization in CodeGenPrepare to move an extractelement right before a store when the target can combine them. The optimization may promote any scalar operations to vector operations in the way to make that possible. Context Some targets use different register files for both vector and scalar operations. This means that transitioning from one domain to another may incur copy from one register file to another. These copies are not coalescable and may be expensive. For example, according to the scheduling model, on cortex-A8 a vector to GPR move is 20 cycles. Motivating Example Let us consider an example: define void @foo(<2 x i32>* %addr1, i32* %dest) { %in1 = load <2 x i32>* %addr1, align 8 %extract = extractelement <2 x i32> %in1, i32 1 %out = or i32 %extract, 1 store i32 %out, i32* %dest, align 4 ret void } As it is, this IR generates the following assembly on armv7: vldr d16, [r0] @ vector load vmov.32 r0, d16[1] @ cross-register-file copy: 20 cycles orr r0, r0, #1 @ scalar bitwise or str r0, [r1] @ scalar store bx lr Whereas we could generate much faster code: vldr d16, [r0] @ vector load vorr.i32 d16, #0x1 @ vector bitwise or vst1.32 {d16[1]}, [r1:32] @ vector extract + store bx lr Half of the computation made in the vector is useless, but this allows us to get rid of the expensive cross-register-file copy. Proposed Solution To avoid this cross-register-copy penalty, we promote the scalar operations to vector operations. The penalty will be removed if we manage to promote the whole chain of computation in the vector domain. Currently, we do that only when the chain of computation ends by a store and the target is able to combine an extract with a store. Stores are the most likely candidates, because other instructions produce values that would need to be promoted and so, extracted at some point[1]. Moreover, it is customary that targets feature stores that perform a vector extract (see AArch64 and X86 for instance). The proposed implementation relies on the TargetTransformInfo to decide whether or not it is beneficial to promote a chain of computation in the vector domain. Unfortunately, this interface is rather inaccurate for this level of details and although this optimization may be beneficial for X86 and AArch64, the inaccuracy will lead to the optimization being too aggressive. Basically in TargetTransformInfo, everything that is legal has a cost of 1, whereas, even if a vector type is legal, usually a vector operation is slightly more expensive than its scalar counterpart. That will lead to too many promotions that may not be counter balanced by the saving of the cross-register-file copy. For instance, on AArch64 this penalty is just 4 cycles. For now, the optimization is just enabled for ARM prior to v8, since those processors have a larger penalty on cross-register-file copies, and the scope is limited to basic blocks. Because of these two factors, we limit the effects of the inaccuracy. Indeed, I did not want to build up a fancy cost model with block frequency and everything on top of that. [1] We can imagine targets that can combine an extractelement with other instructions than just stores. If we want to go into that direction, the current interfaces must be augmented and, moreover, I think this becomes a global isel problem. Differential Revision: http://reviews.llvm.org/D5921 <rdar://problem/14170854> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220978 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-31 17:52:53 +00:00
Chad Rosier	66d3a86a9a	[AArch64] CondOpt pass is missing FCMP instructions when searching backward for a CMP which defines the flags used by B.CC. http://reviews.llvm.org/D6047 Patch by Zhaoshi Zheng <zhaoshiz@codeaurora.org>! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220961 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-31 15:17:36 +00:00
Ulrich Weigand	8a9c531e9a	[PowerPC] Load BlockAddress values from the TOC in 64-bit SVR4 code Since block address values can be larger than 2GB in 64-bit code, they cannot be loaded simply using an @l / @ha pair, but instead must be loaded from the TOC, just like GlobalAddress, ConstantPool, and JumpTable values are. The commit also fixes a bug in PPCLinuxAsmPrinter::doFinalization where temporary labels could not be used as TOC values, since code would attempt (and fail) to use GetOrCreateSymbol to create a symbol of the same name as the temporary label. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220959 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-31 10:33:14 +00:00
Robert Khasanov	7d18d46ef2	[AVX512] Added VBROADCAST{SS/SD} encoding for VL subset. Refactored through AVX512_maskable git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220908 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-30 14:21:47 +00:00
Robert Khasanov	63c2f3292e	[AVX512] Implemented AVX512VL FP binary packed instructions (VADDP, VSUBP, VMULP, VDIVP, VMAXP, VMINP) Refactored through AVX512_maskable Added encoding tests for them. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220858 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-29 15:43:02 +00:00
Robert Khasanov	e1610162fb	[AVX512] Fix VSQRT packed instructions internal names. No functional change git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220808 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-28 18:22:41 +00:00
Robert Khasanov	9371efbcdb	[AVX512] Extended avx512_sqrt_packed (sqrt instructions) to VL subset. Refactored through AVX512_maskable git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220806 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-28 18:15:20 +00:00
Robert Khasanov	59cb03d329	[AVX-512] Expanded rsqrt/rcp instructions to VL subset. Refactored multiclass through AVX512_maskable git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220783 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-28 16:37:13 +00:00
Robert Khasanov	d4345dd85f	[AVX512] Removed special case for cmp instructions in getVectorMaskingNode. Now cmp intrinsics lower as other intrinsics through VSELECT, and then VSELECT transforms to AND in PerformSELECTCombine. No functional change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220779 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-28 16:17:14 +00:00
Robert Khasanov	edf556ec1f	[x86] Simplify vector selection if condition value type matches vselect value type and true value is all ones or false value is all zeros. This transformation worked if selector is produced by SETCC, however SETCC is needed only if we consider to swap operands. So I replaced SETCC check for this case. Added tests for vselect of <X x i1> values. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220777 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-28 15:59:40 +00:00
Robert Khasanov	4a52493457	[AVX512] Bring back vector-shuffle lowering support through broadcasts After the commit at r219046, lowering of 512-bit broadcasts became non-optimal. Most of the tests on broadcasting and embedded broadcasting were changed and they don't produce efficient code. Example below is from commit changes (it's the first test from test/CodeGen/X86/avx512-vbroadcast.ll): define <16 x i32> @_inreg16xi32(i32 %a) { ; CHECK-LABEL: _inreg16xi32: ; CHECK: ## BB#0: -; CHECK-NEXT: vpbroadcastd %edi, %zmm0 +; CHECK-NEXT: vmovd %edi, %xmm0 +; CHECK-NEXT: vpbroadcastd %xmm0, %ymm0 +; CHECK-NEXT: vinserti64x4 $1, %ymm0, %zmm0, %zmm0 ; CHECK-NEXT: retq %b = insertelement <16 x i32> undef, i32 %a, i32 0 %c = shufflevector <16 x i32> %b, <16 x i32> undef, <16 x i32> zeroinitializer ret <16 x i32> %c } Here, 256-bit broadcast was generated instead of 512-bit one. In this patch 1) I added vector-shuffle lowering through broadcasts 2) Removed asserts and branches likes because this is incorrect - assert(Subtarget->hasDQI() && "We can only lower v8i64 with AVX-512-DQI"); 3) Fixed lowering tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220774 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-28 12:28:51 +00:00
Reid Kleckner	d5de327da0	X86: Implement the vectorcall calling convention This is a Microsoft calling convention that supports both x86 and x86_64 subtargets. It passes vector and floating point arguments in XMM0-XMM5, and passes them indirectly once they are consumed. Homogenous vector aggregates of up to four elements can be passed in sequential vector registers, but this part is not implemented in LLVM and will be handled in Clang. On 32-bit x86, it is similar to fastcall in that it uses ecx:edx as integer register parameters and is callee cleanup. On x86_64, it delegates to the normal win64 calling convention. Reviewers: majnemer Differential Revision: http://reviews.llvm.org/D5943 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220745 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-28 01:29:26 +00:00
Tim Northover	dd778c6c9f	AArch64: enable Cortex-A57 FP balancing on Cortex-A53. Benchmarks have shown that it's harmless to the performance there, and having a unified set of passes between the two cores where possible helps big.LITTLE deployment. Patch by Z. Zheng. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220744 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-28 01:24:32 +00:00
NAKAMURA Takumi	9ef2fd8775	AArch64InstrInfo.h: Fix a warning introduced in clang r220703. [-Winconsistent-missing-override] git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220739 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-27 23:29:27 +00:00

31191 Commits