llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-12-15 20:29:48 +00:00

Author	SHA1	Message	Date
Petar Jovanovic	01b026b023	Re-enable target-specific relocation table sorting and use it for Mips Some targets (ie. Mips) have additional rules for ordering the relocation table entries. Allow them to override generic sortRelocs(), which sorts entries by Offset. Then override this function for Mips, to emit HI16 and GOT16 relocations against the local symbol in pair with the corresponding LO16 relocation. Patch by Vladimir Stefanovic. Differential Revision: http://reviews.llvm.org/D7414 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234883 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-14 13:23:34 +00:00
Duncan P. N. Exon Smith	125e3d3959	DebugInfo: Gut DISubprogram and DILexicalBlock* Gut the `DIDescriptor` wrappers around `MDLocalScope` subclasses. Note that `DILexicalBlock` wraps `MDLexicalBlockBase`, not `MDLexicalBlock`. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234850 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-14 03:40:37 +00:00
Duncan P. N. Exon Smith	355ec009e5	DebugInfo: Gut DIVariable and DIGlobalVariable Gut all the non-pointer API from the variable wrappers, except an implicit conversion from `DIGlobalVariable` to `DIDescriptor`. Note that if you're updating out-of-tree code, `DIVariable` wraps `MDLocalVariable` (`MDVariable` is a common base class shared with `MDGlobalVariable`). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234840 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-14 02:22:36 +00:00
Krzysztof Parzyszek	57ea789c3e	Expand ADDO/SUBO on Hexagon git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234795 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-13 20:37:01 +00:00
Jan Vesely	a017ce21ba	Revert revisions r234755, r234759, r234760 Revert "Remove default in fully-covered switch (to fix Clang -Werror -Wcovered-switch-default)" Revert "R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO" Revert "LegalizeDAG: Try to use Overflow operations when expanding ADD/SUB" Using overflow operations fails CodeGen/Generic/2011-07-07-ScheduleDAGCrash.ll on hexagon, nvptx, and r600. Revert while I investigate. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234768 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-13 17:47:15 +00:00
Krzysztof Parzyszek	fcc330abfe	Allow memory intrinsics to be tail calls git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234764 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-13 17:16:45 +00:00
Jan Vesely	4ce8b9c7fb	R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO v2: tighten the sub64 tests v3: rename to CARRY/BORROW v4: fixup test cmdline add known bits computation use sign extend instead of sub 0,x better add test v5: remove redundant break move lowering to separate functions fix comments Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewers: arsenm git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234759 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-13 16:26:00 +00:00
Jan Vesely	b7239d3aa5	R600: Make FMIN/MAXNUM legal on all asics v2: Add tests Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> reviewer: arsenm git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234716 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-12 23:45:05 +00:00
Jan Vesely	8332e14bee	R600: remove manual BFE optimization Fixed since r233079 Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> reviewer: arsenm git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234715 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-12 23:45:01 +00:00
Hal Finkel	ad96655905	[PowerPC] Really iterate over all loops in PPCLoopDataPrefetch/PPCLoopPreIncPrep When I fixed these a couple of days ago to iterate over all loops, not just depth == 1 loops, I inadvertently made it such that we'd only look at the first top-level loop. Make sure that we really look at all of them. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234705 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-12 17:18:56 +00:00
Hal Finkel	1cea397876	[PowerPC] Disable part-word atomics on the P7 As it turns out, even though these are part of ISA 2.06, the P7 does not support them (or, at least, not any P7s we're tested so far). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234686 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-11 13:40:36 +00:00
Nemanja Ivanovic	9ca031b6c6	Add direct moves to/from VSR and exploit them for FP/INT conversions This patch corresponds to review: http://reviews.llvm.org/D8928 It adds direct move instructions to/from VSX registers to GPR's. These are exploited for FP <-> INT conversions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234682 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-11 10:40:42 +00:00
Alexander Kornienko	c16fc54851	Use 'override/final' instead of 'virtual' for overridden methods The patch is generated using clang-tidy misc-use-override check. This command was used: tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py \ -checks='-*,misc-use-override' -header-filter='llvm\|clang' \ -j=32 -fix -format http://reviews.llvm.org/D8925 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234679 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-11 02:11:45 +00:00
Hal Finkel	ff49ef7838	[PowerPC] Fix PPCLoopPreIncPrep for depth > 1 loops This pass had the same problem as the data-prefetching pass: it was only checking for depth == 1 loops in practice. Fix that, add some debugging statements, and make sure that, when we grab an AddRec, it is for the loop we expect. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234670 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-11 00:33:08 +00:00
Ahmed Bougacha	d2069333ee	[CodeGen] Split -enable-global-merge into ARM and AArch64 options. Currently, there's a single flag, checked by the pass itself. It can't force-enable the pass (and is on by default), because it might not even have been created, as that's the targets decision. Instead, have separate explicit flags, so that the decision is consistently made in the target. Keep the flag as a last-resort "force-disable GlobalMerge" for now, for backwards compatibility. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234666 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-11 00:06:36 +00:00
Quentin Colombet	b26cf11a7b	[AArch64] Strengthen the code for the prologue insertion. The spilled registers are pristine and thus, correctly handled by the register scavenger and so on, but the liveness information is strictly speaking wrong at this point. Fix that. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234664 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-10 23:14:34 +00:00
Hal Finkel	d4f643a5df	[PowerPC] Prefetching should also consider depth > 1 loops Iterating over loops from the LoopInfo instance only provides top-level loops. We need to search the whole tree of loops to find the inner ones. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234603 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-10 15:05:02 +00:00
Toma Tabacu	3905bb619c	[mips] [IAS] Improve comments in MipsAsmParser::expandLoadImm. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234595 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-10 13:28:16 +00:00
Chad Rosier	0937a348d0	[AArch64] Changes some SchedAlias to WriteRes for Cortex-A57. Using SchedAliases is convenient and works well for latency and resource lookup for instructions. However, this creates an entry in AArch64WriteLatencyTable with a WriteResourceID of 0, breaking any SchedReadAdvance since the lookup will fail. http://reviews.llvm.org/D8043 Patch by Dave Estes <cestes@codeaurora.org>! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234594 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-10 13:19:27 +00:00
Chad Rosier	694883f0f7	[AArch64] Adjusts Cortex-A57 machine model to handle zero shift. http://reviews.llvm.org/D8043 Patch by Dave Estes <cestes@codeaurora.org>! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234593 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-10 13:19:21 +00:00
Benjamin Kramer	0973b7ddb8	Reduce dyn_cast<> to isa<> or cast<> where possible. No functional change intended. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234586 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-10 11:24:51 +00:00
Jingyue Wu	5733100450	Divergence analysis for GPU programs Summary: Some optimizations such as jump threading and loop unswitching can negatively affect performance when applied to divergent branches. The divergence analysis added in this patch conservatively estimates which branches in a GPU program can diverge. This information can then help LLVM to run certain optimizations selectively. Test Plan: test/Analysis/DivergenceAnalysis/NVPTX/diverge.ll Reviewers: resistor, hfinkel, eliben, meheff, jholewinski Subscribers: broune, bjarke.roune, madhur13490, tstellarAMD, dberlin, echristo, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D8576 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234567 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-10 05:03:50 +00:00
Hal Finkel	36f934a207	[PowerPC] Don't crash on PPC32 i64 fp_to_uint on modern cores When we have an instruction for this (and, thus, don't generate a runtime call), we need to custom type legalize this (in a trivial way, just as we do for fp_to_sint). Fixes PR23173. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234561 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-10 03:39:00 +00:00
Ahmed Bougacha	1810ca3110	[AArch64] Promote f16 operations to f32. For the most common ones (such as fadd), we already did the promotion. Do the same thing for all the others. Currently, we'll just crash/assert on all these operations, as there's no hardware or libcall support whatsoever. f16 (half) is specified as an interchange - not arithmetic - format, and is expected to be promoted to single-precision for arithmetic operations. While there, teach the legalizer about promoting some of the (mostly floating-point) operations that we never needed before. Differential Revision: http://reviews.llvm.org/D8648 See related discussion on the thread for: http://reviews.llvm.org/D8755 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234550 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-10 00:08:48 +00:00
Nemanja Ivanovic	c58b8f0b65	Add LLVM support for remaining integer divide and permute instructions from ISA 2.06 This is the patch corresponding to review: http://reviews.llvm.org/D8406 It adds some missing instructions from ISA 2.06 to the PPC back end. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234546 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-09 23:54:37 +00:00
Rafael Espindola	427c073035	Simplify use of formatted_raw_ostream. formatted_raw_ostream is a wrapper over another stream to add column and line number tracking. It is used only for asm printing. This patch moves the its creation down to where we know we are printing assembly. This has the following advantages: * Simpler lifetime management: std::unique_ptr * We don't compute column and line number of object files :-) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234535 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-09 21:06:08 +00:00
Juergen Ributzka	117bf240ef	[AArch64][FastISel] Fix integer extend optimization. The integer extend optimization tries to fold the extend into the load instruction. This requires us to identify if the extend has already been emitted or not and act accordingly on it. The check that was originally performed for this was not sufficient. Besides checking the ValueMap for a mapped register we also need to check if the virtual register has already an associated machine instruction that defines it. This fixes rdar://problem/20470788. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234529 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-09 20:00:46 +00:00
Eric Christopher	a9fa357be4	Remove duplicated code and consolidate initializers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234525 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-09 19:20:37 +00:00
Rafael Espindola	7e0993d377	clang-format bits of code to make a followup patch easy to read. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234519 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-09 18:32:58 +00:00
Rafael Espindola	440f0b28a8	Use a raw_svector_ostream instead of a raw_string_ostream. It saves a bit of copying. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234507 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-09 17:16:25 +00:00
Rafael Espindola	e4053cd377	Don't repeat name in comment. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234506 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-09 17:10:57 +00:00
Rafael Espindola	a93117c50d	This reverts commit r234460 and r234461. Revert "Add classof implementations to the raw_ostream classes." Revert "Use the cast machinery to remove dummy uses of formatted_raw_ostream." The underlying issue can be fixed without classof. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234495 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-09 15:54:59 +00:00
Javed Absar	28c2fda9df	[ARM] support for Cortex-R4/R4F Currently, llvm (backend) doesn't know cortex-r4, even though it is the default target for armv7r. Using "--target=armv7r-arm-none-eabi" provokes 'cortex-r4' is not a recognized processor for this target' by llvm. This patch adds support for cortex-r4 and, very closely related, r4f. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234486 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-09 14:07:28 +00:00
Toma Tabacu	52c5645668	[mips] Refactor saved-registers bitmask creation in MipsAsmPrinter::printSavedRegsBitmask. NFC. Summary: Make the code more readable by fusing the for-loops together and explicitly checking for each register class. Also, this version is more straightforward because it doesn't assume that FPU registers always come before CPU registers in the CalleeSavedInfo vector. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8033 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234475 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-09 10:54:16 +00:00
Kristof Beyls	2a6ad5bfb2	[AArch64] Add support for dynamic stack alignment Differential Revision: http://reviews.llvm.org/D8876 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234471 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-09 08:49:47 +00:00
Lang Hames	64008ef318	[AArch64] Remove redundant -march option. Also fix a think-o from r234462. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234467 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-09 05:34:57 +00:00
Lang Hames	174f04eefb	[AArch64] Teach AArch64TargetLowering::getOptimalMemOpType to consider alignment restrictions when choosing a type for small-memcpy inlining in SelectionDAGBuilder. This ensures that the loads and stores output for the memcpy won't be further expanded during legalization, which would cause the total number of instructions for the memcpy to exceed (often significantly) the inlining thresholds. <rdar://problem/17829180> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234462 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-09 03:40:33 +00:00
Rafael Espindola	23295f613b	Use the cast machinery to remove dummy uses of formatted_raw_ostream. If we know we are producing an object, we don't need to wrap the stream in a formatted_raw_ostream anymore. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234461 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-09 02:28:12 +00:00
Scott Douglass	7e2bc24e05	[ARM] make vminnm/vmaxnm work with ?le, ?ge and no-nans-fp-math Because -menable-no-nans causes fcmp conditions to be rewritten without 'o' or 'u' the recognition code in needs to cope. Also extended it to handle 'le' and 'ge. Differential Revision: http://reviews.llvm.org/D8725 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234421 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-08 17:18:28 +00:00
Toma Tabacu	739ca842aa	[mips] [IAS] Do not generate redundant move when expanding lw/sw with symbol. Summary: Even though there is no 2nd register operand in the "lw/sw $8, symbol" case, we still try to find one, and we end up with $0, which makes us generate an unnecessary "addu $8, $8, $0" (a.k.a. "move $8, $8"). We can avoid this by checking if the 2nd register operand is different from $0, before generating the addu. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8055 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234406 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-08 13:52:41 +00:00
Toma Tabacu	f716ca43ca	[mips] [IAS] Add support for the BNEZL and BEQZL pseudo-instructions. Summary: They are of the form "bnezl/beqzl $rs, offset" and expand to "bnel/beql $rs, $zero, offset". These instructions are used in Linux inline assembly. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8540 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234401 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-08 12:15:05 +00:00
Sergey Dmitrouk	a7512d1d4a	[ARM][Debug Info] Restore emitting of .cfi_def_cfa_offset for functions without stack frame Summary: Looks like new code from [[ http://reviews.llvm.org/rL222057 \| rL222057 ]] doesn't account for early `return` in `ARMFrameLowering::emitPrologue`, which leads to loosing `.cfi_def_cfa_offset` directive for functions without stack frame. Reviewers: echristo, rengolin, asl, t.p.northover Reviewed By: t.p.northover Subscribers: llvm-commits, rengolin, aemerson Differential Revision: http://reviews.llvm.org/D8606 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234399 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-08 10:10:12 +00:00
Toma Tabacu	39fedc9aa2	[mips] [IAS] Remove AssemblerPredicate's from RelocPIC and RelocStatic. Summary: These AssemblerPredicate's are unnecessary and actually make some instructions unusable when assembling pre-MIPS32 ISAs. For example, this was causing the IAS to reject the 'j' instruction for MIPS I-V. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8300 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234398 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-08 10:06:45 +00:00
Alexei Starovoitov	34354ecc30	[bpf] support BPF backend as shared library dependencies were not set correctly for shared library build. static was ok Patch by Brenden Blanco. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234386 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-08 03:46:51 +00:00
Tom Stellard	65bae699c4	R600/SI: Add some missing overrides git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234384 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-08 02:07:05 +00:00
Tom Stellard	a787066317	R600/SI: Initial support for assembler and inline assembly This is currently considered experimental, but most of the more commonly used instructions should work. So far only SI has been extensively tested, CI and VI probably work too, but may be buggy. The current set of tests cases do not give complete coverage, but I think it is sufficient for an experimental assembler. See the documentation in R600Usage for more information. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234381 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-08 01:09:26 +00:00
Tom Stellard	c0578c36bb	R600/SI: Add missing SOPK instructions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234380 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-08 01:09:22 +00:00
Tom Stellard	434e097df8	R600/SI: Don't print offset0/offset1 DS operands when they are 0 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234379 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-08 01:09:19 +00:00
Tim Northover	112102c7fe	AArch64: disallow "fmov sD, #-0.0" during assembly. We weren't checking the sign of the floating point immediate before translating it to "fmov sD, wzr". Similarly for D-regs. Technically "movi vD.2s, #0x80, lsl #24" would work most of the time, but it's not a blessed alias (and I don't think it should be since people expect writing sD to zero out the high lanes, and there's no dD equivalent). So an error it is. rdar://20455398 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234372 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-07 22:49:47 +00:00
Ahmed Bougacha	236bac7a87	[ARM] Mark a bunch of .td Operands with type _MEMORY. This shouldn't affect anything in-tree, as the OperandType users are mostly smart disassemblers and such; more information is helpful there. However, on the flip side, that + the fact that this is just hinting at the meaning of operands makes this not really test-worthy or testable. Differential Revision: http://reviews.llvm.org/D8620 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234350 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-07 20:31:16 +00:00
Alexei Starovoitov	68e6b651a1	[bpf] fix build fix the build and remove unused variable warnings in Release mode. Patch by Brenden Blanco. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234349 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-07 20:25:34 +00:00
Matthias Braun	0d8314f6f3	AArch64: Don't lower ISD::SELECT to ISD::SELECT_CC Instead of lowering SELECT to SELECT_CC which is further lowered later immediately call the SELECT_CC lowering code. This is preferable because: - Avoids an unnecessary roundtrip through the legalization queues with an intermediate node. - More importantly: Lowered operations get visited last leading to SELECT_CC getting visited with legalized operands and unlegalized ones for preexisting SELECT_CC nodes. This does not hurt the current code (hence no testcase) but is required for another patch I am working on. Differential Revision: http://reviews.llvm.org/D8187 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234334 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-07 17:33:05 +00:00
Toma Tabacu	db3b3a0b9f	[mips] [IAS] Allow .set assignments for already defined symbols. Summary: This is not possible when using the IAS for MIPS, but it is possible when using the IAS for other architectures and when using GAS for MIPS. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8578 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234316 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-07 13:59:39 +00:00
Rafael Espindola	838c24a7c8	Refactor a lot of duplicated code for stub output. This also moves it earlier so that it they are produced before we print an end symbol for the data section. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234315 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-07 13:42:44 +00:00
Aaron Ballman	f9535e2b1e	Silencing several "enumeral and non-enumeral type in conditional expression" warnings; NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234314 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-07 13:28:37 +00:00
Duncan P. N. Exon Smith	0477045c32	CodeGen: Stop using DIDescriptor::is*() and auto-casting Same as r234255, but for lib/CodeGen and lib/Target. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234258 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-06 23:27:40 +00:00
Tim Northover	8af3f965e0	ARM: do not relax Thumb1 -> Thumb2 if only Thumb1 is available. After recognising that a certain narrow instruction might need a relocation to be represented, we used to unconditionally relax it to a Thumb2 instruction to permit this. Unfortunately, some CPUs (e.g. v6m) don't even have most Thumb2 instructions, so we end up emitting a completely invalid instruction. Theoretically, ELF does have relocations for these situations; but they are fairly unusable with such short ranges and the ABI document even says they're documented "for completeness". So an error is probably better there too. rdar://20391953 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234195 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-06 18:44:42 +00:00
Simon Pilgrim	2ec7242600	[X86][SSE] Use (V)PINSRB for direct byte insertion in 16i8 buildvector on SSE4.1 targets This patch allows SSE4.1 targets to use (V)PINSRB to create 16i8 vectors by inserting i8 scalars directly into a XMM register instead of merging pairs of i8 scalars into a i16 and using the SSE2 PINSRW instruction. This allows folding of byte loads and reduces scalar register usage as well. Differential Revision: http://reviews.llvm.org/D8839 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234193 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-06 18:39:00 +00:00
Rafael Espindola	e17e7a2400	Remove unnecessary uses of AliasedSymbol. As pr19627 points out, every use of AliasedSymbol is likely a bug. The main use was to avoid the oddity of a variable showing up as undefined. That was fixed in r233995, which made these calls nops. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234169 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-06 16:10:05 +00:00
Rafael Espindola	83be4429b2	Store the sh_link of ARM_EXIDX directly in MCSectionELF. This avoids some pretty horrible and broken name based section handling. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234142 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-06 04:25:18 +00:00
Rafael Espindola	903f4a2051	Implement unique sections with an unique ID. This allows the compiler/assembly programmer to switch back to a section. This in turn fixes the bootstrap failure on powerpc (tested on gcc110) without changing the ppc codegen at all. I will try to cleanup the various getELFSection overloads in a followup patch. Just using a default argument now would lead to ambiguities. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234099 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-04 18:02:01 +00:00
Craig Topper	4d1e15c54f	[X86] Apply AddedComplexity consistently for similar patterns. This keeps them together in the DAGISel tables and reduces table size slightly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234086 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-04 04:22:12 +00:00
Craig Topper	4ed0907298	[X86] Add a comment about the change in r234075. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234079 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-04 02:31:43 +00:00
Craig Topper	b1ff87ec86	[X86] Don't use GR64 register 'and with immediate' instructions if the immediate is zero in the upper 33-bits or upper 57-bits. Use GR32 instructions instead. Previously the patterns didn't have high enough priority and we would only use the GR32 form if the only the upper 32 or 56 bits were zero. Fixes PR23100. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234075 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-04 02:08:20 +00:00
David Majnemer	f89ce9a09d	[WinEH] Sink UnwindHelp completely out of IR We don't need to represent UnwindHelp in IR. Instead, we can use the knowledge that we are emitting the parent function to decide if we should create the UnwindHelp stack object. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234061 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-03 22:32:26 +00:00
David Blaikie	4a86b381a3	[opaque pointer type] More GEP IRBuilder API migrations... git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234058 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-03 21:33:42 +00:00
David Blaikie	cf57d81b6e	[opaque pointer type] More GEP API migrations in IRBuilder uses The plan here is to push the API changes out from the common components (like Constant::getGetElementPtr and IRBuilder::CreateGEP related functions) and just update callers to either pass the type if it's obvious, or pass null. Do this with LoadInst as well and anything else that comes up, then to start porting specific uses to not pass null anymore - this may require some refactoring in each case. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234042 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-03 19:41:44 +00:00
Duncan P. N. Exon Smith	f4f021c0a4	CodeGen: Assert that inlined-at locations agree As a follow-up to r234021, assert that a debug info intrinsic variable's `MDLocalVariable::getInlinedAt()` always matches the `MDLocation::getInlinedAt()` of its `!dbg` attachment. The goal here is to get rid of `MDLocalVariable::getInlinedAt()` entirely (PR22778), but I'll let these assertions bake for a while first. If you have an out-of-tree backend that just broke, you're probably attaching the wrong `DebugLoc` to a `DBG_VALUE` instruction. The one you want is the location that was attached to the corresponding `@llvm.dbg.declare` or `@llvm.dbg.value` call that you started with. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234038 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-03 19:20:26 +00:00
Simon Pilgrim	be149a8148	[X86] Added SSE4.2 CRC32 memory folding patterns + tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234013 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-03 14:24:40 +00:00
Bill Schmidt	1c4b6de5ed	[PowerPC] Enable splat generation for BUILD_VECTOR with little endian When enabling PPC64LE, I disabled some optimizations of BUILD_VECTOR nodes for little endian because wrong results were produced. I've subsequently investigated and found this is due to a call to BuildVectorSDNode::isConstantSplat that was always specifying big-endian. With this changed to correctly identify the target endianness, the optimizations work as expected. I found another case of a call to the same method with big-endian hardcoded, in PPC::isAllNegativeZeroVector(). I discovered this was an orphaned method with no callers, so I've just removed it. The existing test/CodeGen/PowerPC/vec_constants.ll checks these optimizations, so for testing I've just added a variant for little endian. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234011 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-03 13:48:24 +00:00
Simon Pilgrim	e5ecd32488	[X86][3DNow] Added 3DNow! memory folding patterns + tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234008 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-03 11:50:30 +00:00
Peter Collingbourne	c39f5dd0e2	MC: For variable symbols, maintain MCSymbol::Section as a cache. Fixes PR19582. Previously, when an asm assignment (.set or =) was created, we would look up the section immediately in MCSymbol::setVariableValue. This caused symbols to receive the wrong section if the RHS of the assignment had not been seen yet. This had a knock-on effect in the object file emitters, causing them to emit extra symbols, or to give symbols the wrong visibility or the wrong section. For example, in the following asm: .data .Llocal: .text leaq .Llocal1(%rip), %rdi .Llocal1 = .Llocal2 .Llocal2 = .Llocal the first assignment would give .Llocal1 a null section, which would never get fixed up by the second assignment. This would cause the ELF object file emitter to consider .Llocal1 to be an undefined symbol and give it external linkage, even though .Llocal1 should not have been emitted at all in the object file. Or in the following asm: alias_to_local = Ltmp0 Ltmp0: the Mach-O object file emitter would give the alias_to_local symbol a n_type of N_SECT and a n_sect of 0. This is invalid under the Mach-O specification, which requires N_SECT symbols to receive a non-zero section number if the symbol is defined in a section in the object file. https://developer.apple.com/library/mac/documentation/DeveloperTools/Conceptual/MachORuntime/#//apple_ref/c/tag/nlist After this change we do not look up the section when the assignment is created, but instead look it up on demand and store it in Section, which is treated as a cache if the symbol is a variable symbol. This change also fixes a bug in MCExpr::FindAssociatedSection. Previously, if we saw a subtraction, we would return the first referenced section, even in cases where we should have been returning the absolute pseudo-section. Now we always return the absolute pseudo-section for expressions that subtract two section-derived expressions. This isn't always correct (e.g. if one of the sections ends up being laid out at an absolute address), but it's probably the best we can do without more context. This allows us to remove code in two places where we appear to have been working around this bug, in MachObjectWriter::markAbsoluteVariableSymbols and in X86AsmPrinter::EmitStartOfAsmFile. Re-applies r233595 (aka D8586), which was reverted in r233898. Differential Revision: http://reviews.llvm.org/D8798 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233995 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-03 01:46:11 +00:00
Matthias Braun	67a2b82b14	ARM: Handle physreg targets in RegPair hints gracefully Register coalescing can change the target of a RegPair hint to a physreg, we should not crash on this. This also slightly improved the way ARMBaseRegisterInfo::updateRegAllocHint() works. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233987 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-03 00:18:38 +00:00
Sanjay Patel	5b93ab6cde	[AVX] Improve insertion of i8 or i16 into low element of 256-bit zero vector Without this patch, we split the 256-bit vector into halves and produced something like: movzwl (%rdi), %eax vmovd %eax, %xmm0 vxorps %xmm1, %xmm1, %xmm1 vblendps $15, %ymm0, %ymm1, %ymm0 ## ymm0 = ymm0[0,1,2,3],ymm1[4,5,6,7] Now, we eliminate the xor and blend because those zeros are free with the vmovd: movzwl (%rdi), %eax vmovd %eax, %xmm0 This should be the final fix needed to resolve PR22685: https://llvm.org/bugs/show_bug.cgi?id=22685 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233941 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-02 20:21:52 +00:00
David Blaikie	19443c1bcb	[opaque pointer type] API migration for GEP constant factories Require the pointee type to be passed explicitly and assert that it is correct. For now it's possible to pass nullptr here (and I've done so in a few places in this patch) but eventually that will be disallowed once all clients have been updated or removed. It'll be a long road to get all the way there... but if you have the cahnce to update your callers to pass the type explicitly without depending on a pointer's element type, that would be a good thing to do soon and a necessary thing to do eventually. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233938 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-02 18:55:32 +00:00
Quentin Colombet	fa8f2103a5	[AArch64] Add a comment to make it explicit why we increased the complexity. Follow-up of r233653. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233936 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-02 18:54:23 +00:00
Sanjay Patel	8765e82c83	[X86, AVX] adjust tablegen patterns to generate better code for scalar insertion into zero vector (PR23073) For code like this: define <8 x i32> @load_v8i32() { ret <8 x i32> <i32 7, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0> } We produce this AVX code: _load_v8i32: ## @load_v8i32 movl $7, %eax vmovd %eax, %xmm0 vxorps %ymm1, %ymm1, %ymm1 vblendps $1, %ymm0, %ymm1, %ymm0 ## ymm0 = ymm0[0],ymm1[1,2,3,4,5,6,7] retq There are at least 2 bugs in play here: We're generating a blend when a move scalar does the same job using 2 less instruction bytes (see FIXMEs). We're not matching an existing pattern that would eliminate the xor and blend entirely. The zero bytes are free with vmovd. The 2nd fix involves an adjustment of "AddedComplexity" [1] and mostly masks the 1st problem. [1] AddedComplexity has close to no documentation in the source. The best we have is this comment: "roughly corresponds to the number of nodes that are covered". It appears that x86 has bastardized this definition by inflating its values for some other undocumented reason. For example, we have a pattern with "AddedComplexity = 400" (!). I searched my way to this page: https://groups.google.com/forum/#!topic/llvm-dev/5UX-Og9M0xQ Differential Revision: http://reviews.llvm.org/D8794 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233931 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-02 17:56:17 +00:00
Vasileios Kalintiris	3fabd02ea5	[mips] Implement eliminateCallFramePseudoInstr() in MipsFrameLowering. NFC. Summary: Avoid duplicate code in Mips16FrameLowering and MipsSEFrameLowering by providing an implementation of the eliminateCallFramePseudoInstr() function from their base class. Depends on D8640. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8641 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233909 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-02 11:09:40 +00:00
Elena Demikhovsky	4eb165220f	AVX-512: intrinsics for VPADD, VPMULDQ and VPSUB by Asaf Badouh (asaf.badouh@intel.com) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233906 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-02 10:51:40 +00:00
Vasileios Kalintiris	91a9b15130	[mips] Expose adjustStackPtr() from MipsInstrInfo. NFC. Summary: adjustStackPtr() is implemented from both MipsSEInstrInfo and Mips16InstrInfo. It makes sense to expose this function from MipsInstrInfo and avoid explicit casting in some cases. Depends on D8638. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8640 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233905 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-02 10:42:44 +00:00
Vasileios Kalintiris	cb64898a22	[mips] Make sure that we don't adjust the stack pointer by zero amount. Reviewers: dsanders Reviewed By: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8638 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233904 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-02 10:14:54 +00:00
Peter Collingbourne	a8432640e8	Revert r233595, "MC: For variable symbols, maintain MCSymbol::Section as a cache." git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233898 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-02 07:02:51 +00:00
Benjamin Kramer	a066ed09db	[X86] Don't accidentally select shll $1, %eax when shrinking an immediate. addl has higher throughput and this was needlessly picking a suboptimal encoding causing PR23098. I wish there was a way of doing this without further duplicating tbl- generated patterns, but so far I haven't found one. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233832 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-01 19:01:09 +00:00
Vladimir Sukharev	0751793310	[ARM] Rename v8.1a from "extension" to "architecture" v8.1a is renamed to architecture, following current entity naming approach. Excess generic cpu is removed. Intended use: "generic" cpu with "v8.1a" subtarget feature Reviewers: jmolloy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8767 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233811 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-01 14:54:56 +00:00
Vladimir Sukharev	65f303fd0c	[AArch64] Rename v8.1a from "extension" to "architecture" v8.1a is renamed to architecture, accordingly to approaches in ARM backend. Excess generic cpu is removed. Intended use: "generic" cpu with "v8.1a" subtarget feature Reviewers: jmolloy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8766 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233810 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-01 14:49:29 +00:00
Ulrich Weigand	adf55a5a57	[SystemZ] Support transactional execution on zEC12 The zEC12 provides the transactional-execution facility. This is exposed to users via a set of builtin routines on other compilers. This patch adds LLVM support to enable those builtins. In partciular, the patch: - adds the transactional-execution and processor-assist facilities - adds MC support for all instructions provided by those facilities - adds LLVM intrinsics for those instructions and hooks them up for CodeGen - adds CodeGen support to optimize CC return value checking Since this is first use of target-specific intrinsics on the platform, the patch creates the include/llvm/IR/IntrinsicsSystemZ.td file and hooks it up in Intrinsics.td. I've also changed Triple::getArchTypePrefix to return "s390" instead of "systemz", since the naming convention for GCC intrinsics uses "s390" on the platform, and it neemed more straight- forward to use the same convention for LLVM IR intrinsics. An associated clang patch makes the intrinsics (and command line switches) available at the source-language level. For reference, the transactional-execution instructions are documented in the z/Architecture Principles of Operation for the zEC12: http://publibfp.boulder.ibm.com/cgi-bin/bookmgr/download/DZ9ZR009.pdf The associated builtins are documented in the GCC manual: http://gcc.gnu.org/onlinedocs/gcc/S_002f390-System-z-Built-in-Functions.html Index: llvm-head/lib/Target/SystemZ/SystemZOperators.td =================================================================== --- llvm-head.orig/lib/Target/SystemZ/SystemZOperators.td +++ llvm-head/lib/Target/SystemZ/SystemZOperators.td @@ -79,6 +79,9 @@ def SDT_ZI32Intrinsic : SDTypeProf def SDT_ZPrefetch : SDTypeProfile<0, 2, [SDTCisVT<0, i32>, SDTCisPtrTy<1>]>; +def SDT_ZTBegin : SDTypeProfile<0, 2, + [SDTCisPtrTy<0>, + SDTCisVT<1, i32>]>; //===----------------------------------------------------------------------===// // Node definitions @@ -180,6 +183,15 @@ def z_prefetch : SDNode<"System [SDNPHasChain, SDNPMayLoad, SDNPMayStore, SDNPMemOperand]>; +def z_tbegin : SDNode<"SystemZISD::TBEGIN", SDT_ZTBegin, + [SDNPHasChain, SDNPOutGlue, SDNPMayStore, + SDNPSideEffect]>; +def z_tbegin_nofloat : SDNode<"SystemZISD::TBEGIN_NOFLOAT", SDT_ZTBegin, + [SDNPHasChain, SDNPOutGlue, SDNPMayStore, + SDNPSideEffect]>; +def z_tend : SDNode<"SystemZISD::TEND", SDTNone, + [SDNPHasChain, SDNPOutGlue, SDNPSideEffect]>; + //===----------------------------------------------------------------------===// // Pattern fragments //===----------------------------------------------------------------------===// Index: llvm-head/lib/Target/SystemZ/SystemZInstrFormats.td =================================================================== --- llvm-head.orig/lib/Target/SystemZ/SystemZInstrFormats.td +++ llvm-head/lib/Target/SystemZ/SystemZInstrFormats.td @@ -473,6 +473,17 @@ class InstSS<bits<8> op, dag outs, dag i let Inst{15-0} = BD2; } +class InstS<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern> + : InstSystemZ<4, outs, ins, asmstr, pattern> { + field bits<32> Inst; + field bits<32> SoftFail = 0; + + bits<16> BD2; + + let Inst{31-16} = op; + let Inst{15-0} = BD2; +} + //===----------------------------------------------------------------------===// // Instruction definitions with semantics //===----------------------------------------------------------------------===// Index: llvm-head/lib/Target/SystemZ/SystemZInstrInfo.td =================================================================== --- llvm-head.orig/lib/Target/SystemZ/SystemZInstrInfo.td +++ llvm-head/lib/Target/SystemZ/SystemZInstrInfo.td @@ -1362,6 +1362,60 @@ let Defs = [CC] in { } //===----------------------------------------------------------------------===// +// Transactional execution +//===----------------------------------------------------------------------===// + +let Predicates = [FeatureTransactionalExecution] in { + // Transaction Begin + let hasSideEffects = 1, mayStore = 1, + usesCustomInserter = 1, Defs = [CC] in { + def TBEGIN : InstSIL<0xE560, + (outs), (ins bdaddr12only:$BD1, imm32zx16:$I2), + "tbegin\t$BD1, $I2", + [(z_tbegin bdaddr12only:$BD1, imm32zx16:$I2)]>; + def TBEGIN_nofloat : Pseudo<(outs), (ins bdaddr12only:$BD1, imm32zx16:$I2), + [(z_tbegin_nofloat bdaddr12only:$BD1, + imm32zx16:$I2)]>; + def TBEGINC : InstSIL<0xE561, + (outs), (ins bdaddr12only:$BD1, imm32zx16:$I2), + "tbeginc\t$BD1, $I2", + [(int_s390_tbeginc bdaddr12only:$BD1, + imm32zx16:$I2)]>; + } + + // Transaction End + let hasSideEffects = 1, Defs = [CC], BD2 = 0 in + def TEND : InstS<0xB2F8, (outs), (ins), "tend", [(z_tend)]>; + + // Transaction Abort + let hasSideEffects = 1, isTerminator = 1, isBarrier = 1 in + def TABORT : InstS<0xB2FC, (outs), (ins bdaddr12only:$BD2), + "tabort\t$BD2", + [(int_s390_tabort bdaddr12only:$BD2)]>; + + // Nontransactional Store + let hasSideEffects = 1 in + def NTSTG : StoreRXY<"ntstg", 0xE325, int_s390_ntstg, GR64, 8>; + + // Extract Transaction Nesting Depth + let hasSideEffects = 1 in + def ETND : InherentRRE<"etnd", 0xB2EC, GR32, (int_s390_etnd)>; +} + +//===----------------------------------------------------------------------===// +// Processor assist +//===----------------------------------------------------------------------===// + +let Predicates = [FeatureProcessorAssist] in { + let hasSideEffects = 1, R4 = 0 in + def PPA : InstRRF<0xB2E8, (outs), (ins GR64:$R1, GR64:$R2, imm32zx4:$R3), + "ppa\t$R1, $R2, $R3", []>; + def : Pat<(int_s390_ppa_txassist GR32:$src), + (PPA (INSERT_SUBREG (i64 (IMPLICIT_DEF)), GR32:$src, subreg_l32), + 0, 1)>; +} + +//===----------------------------------------------------------------------===// // Miscellaneous Instructions. //===----------------------------------------------------------------------===// Index: llvm-head/lib/Target/SystemZ/SystemZProcessors.td =================================================================== --- llvm-head.orig/lib/Target/SystemZ/SystemZProcessors.td +++ llvm-head/lib/Target/SystemZ/SystemZProcessors.td @@ -60,6 +60,16 @@ def FeatureMiscellaneousExtensions : Sys "Assume that the miscellaneous-extensions facility is installed" >; +def FeatureTransactionalExecution : SystemZFeature< + "transactional-execution", "TransactionalExecution", + "Assume that the transactional-execution facility is installed" +>; + +def FeatureProcessorAssist : SystemZFeature< + "processor-assist", "ProcessorAssist", + "Assume that the processor-assist facility is installed" +>; + def : Processor<"generic", NoItineraries, []>; def : Processor<"z10", NoItineraries, []>; def : Processor<"z196", NoItineraries, @@ -70,4 +80,5 @@ def : Processor<"zEC12", NoItineraries, [FeatureDistinctOps, FeatureLoadStoreOnCond, FeatureHighWord, FeatureFPExtension, FeaturePopulationCount, FeatureFastSerialization, FeatureInterlockedAccess1, - FeatureMiscellaneousExtensions]>; + FeatureMiscellaneousExtensions, + FeatureTransactionalExecution, FeatureProcessorAssist]>; Index: llvm-head/lib/Target/SystemZ/SystemZSubtarget.cpp =================================================================== --- llvm-head.orig/lib/Target/SystemZ/SystemZSubtarget.cpp +++ llvm-head/lib/Target/SystemZ/SystemZSubtarget.cpp @@ -40,6 +40,7 @@ SystemZSubtarget::SystemZSubtarget(const HasLoadStoreOnCond(false), HasHighWord(false), HasFPExtension(false), HasPopulationCount(false), HasFastSerialization(false), HasInterlockedAccess1(false), HasMiscellaneousExtensions(false), + HasTransactionalExecution(false), HasProcessorAssist(false), TargetTriple(TT), InstrInfo(initializeSubtargetDependencies(CPU, FS)), TLInfo(TM, this), TSInfo(TM.getDataLayout()), FrameLowering() {} Index: llvm-head/lib/Target/SystemZ/SystemZSubtarget.h =================================================================== --- llvm-head.orig/lib/Target/SystemZ/SystemZSubtarget.h +++ llvm-head/lib/Target/SystemZ/SystemZSubtarget.h @@ -42,6 +42,8 @@ protected: bool HasFastSerialization; bool HasInterlockedAccess1; bool HasMiscellaneousExtensions; + bool HasTransactionalExecution; + bool HasProcessorAssist; private: Triple TargetTriple; @@ -102,6 +104,12 @@ public: return HasMiscellaneousExtensions; } + // Return true if the target has the transactional-execution facility. + bool hasTransactionalExecution() const { return HasTransactionalExecution; } + + // Return true if the target has the processor-assist facility. + bool hasProcessorAssist() const { return HasProcessorAssist; } + // Return true if GV can be accessed using LARL for reloc model RM // and code model CM. bool isPC32DBLSymbol(const GlobalValue GV, Reloc::Model RM, Index: llvm-head/lib/Support/Triple.cpp =================================================================== --- llvm-head.orig/lib/Support/Triple.cpp +++ llvm-head/lib/Support/Triple.cpp @@ -92,7 +92,7 @@ const char Triple::getArchTypePrefix(Ar case sparcv9: case sparc: return "sparc"; - case systemz: return "systemz"; + case systemz: return "s390"; case x86: case x86_64: return "x86"; Index: llvm-head/include/llvm/IR/Intrinsics.td =================================================================== --- llvm-head.orig/include/llvm/IR/Intrinsics.td +++ llvm-head/include/llvm/IR/Intrinsics.td @@ -634,3 +634,4 @@ include "llvm/IR/IntrinsicsNVVM.td" include "llvm/IR/IntrinsicsMips.td" include "llvm/IR/IntrinsicsR600.td" include "llvm/IR/IntrinsicsBPF.td" +include "llvm/IR/IntrinsicsSystemZ.td" Index: llvm-head/include/llvm/IR/IntrinsicsSystemZ.td =================================================================== --- /dev/null +++ llvm-head/include/llvm/IR/IntrinsicsSystemZ.td @@ -0,0 +1,46 @@ +//===- IntrinsicsSystemZ.td - Defines SystemZ intrinsics ---- tablegen --===// +// +// The LLVM Compiler Infrastructure +// +// This file is distributed under the University of Illinois Open Source +// License. See LICENSE.TXT for details. +// +//===----------------------------------------------------------------------===// +// +// This file defines all of the SystemZ-specific intrinsics. +// +//===----------------------------------------------------------------------===// + +//===----------------------------------------------------------------------===// +// +// Transactional-execution intrinsics +// +//===----------------------------------------------------------------------===// + +let TargetPrefix = "s390" in { + def int_s390_tbegin : Intrinsic<[llvm_i32_ty], [llvm_ptr_ty, llvm_i32_ty], + [IntrNoDuplicate]>; + + def int_s390_tbegin_nofloat : Intrinsic<[llvm_i32_ty], + [llvm_ptr_ty, llvm_i32_ty], + [IntrNoDuplicate]>; + + def int_s390_tbeginc : Intrinsic<[], [llvm_ptr_ty, llvm_i32_ty], + [IntrNoDuplicate]>; + + def int_s390_tabort : Intrinsic<[], [llvm_i64_ty], + [IntrNoReturn, Throws]>; + + def int_s390_tend : GCCBuiltin<"__builtin_tend">, + Intrinsic<[llvm_i32_ty], []>; + + def int_s390_etnd : GCCBuiltin<"__builtin_tx_nesting_depth">, + Intrinsic<[llvm_i32_ty], [], [IntrNoMem]>; + + def int_s390_ntstg : Intrinsic<[], [llvm_i64_ty, llvm_ptr64_ty], + [IntrReadWriteArgMem]>; + + def int_s390_ppa_txassist : GCCBuiltin<"__builtin_tx_assist">, + Intrinsic<[], [llvm_i32_ty]>; +} + Index: llvm-head/lib/Target/SystemZ/SystemZ.h =================================================================== --- llvm-head.orig/lib/Target/SystemZ/SystemZ.h +++ llvm-head/lib/Target/SystemZ/SystemZ.h @@ -68,6 +68,18 @@ const unsigned CCMASK_TM_MSB_0 = C const unsigned CCMASK_TM_MSB_1 = CCMASK_2 \| CCMASK_3; const unsigned CCMASK_TM = CCMASK_ANY; +// Condition-code mask assignments for TRANSACTION_BEGIN. +const unsigned CCMASK_TBEGIN_STARTED = CCMASK_0; +const unsigned CCMASK_TBEGIN_INDETERMINATE = CCMASK_1; +const unsigned CCMASK_TBEGIN_TRANSIENT = CCMASK_2; +const unsigned CCMASK_TBEGIN_PERSISTENT = CCMASK_3; +const unsigned CCMASK_TBEGIN = CCMASK_ANY; + +// Condition-code mask assignments for TRANSACTION_END. +const unsigned CCMASK_TEND_TX = CCMASK_0; +const unsigned CCMASK_TEND_NOTX = CCMASK_2; +const unsigned CCMASK_TEND = CCMASK_TEND_TX \| CCMASK_TEND_NOTX; + // The position of the low CC bit in an IPM result. const unsigned IPM_CC = 28; Index: llvm-head/lib/Target/SystemZ/SystemZISelLowering.h =================================================================== --- llvm-head.orig/lib/Target/SystemZ/SystemZISelLowering.h +++ llvm-head/lib/Target/SystemZ/SystemZISelLowering.h @@ -146,6 +146,15 @@ enum { // Perform a serialization operation. (BCR 15,0 or BCR 14,0.) SERIALIZE, + // Transaction begin. The first operand is the chain, the second + // the TDB pointer, and the third the immediate control field. + // Returns chain and glue. + TBEGIN, + TBEGIN_NOFLOAT, + + // Transaction end. Just the chain operand. Returns chain and glue. + TEND, + // Wrappers around the inner loop of an 8- or 16-bit ATOMIC_SWAP or // ATOMIC_LOAD_<op>. // @@ -318,6 +327,7 @@ private: SDValue lowerSTACKSAVE(SDValue Op, SelectionDAG &DAG) const; SDValue lowerSTACKRESTORE(SDValue Op, SelectionDAG &DAG) const; SDValue lowerPREFETCH(SDValue Op, SelectionDAG &DAG) const; + SDValue lowerINTRINSIC_W_CHAIN(SDValue Op, SelectionDAG &DAG) const; // If the last instruction before MBBI in MBB was some form of COMPARE, // try to replace it with a COMPARE AND BRANCH just before MBBI. @@ -355,6 +365,10 @@ private: MachineBasicBlock emitStringWrapper(MachineInstr MI, MachineBasicBlock BB, unsigned Opcode) const; + MachineBasicBlock emitTransactionBegin(MachineInstr MI, + MachineBasicBlock MBB, + unsigned Opcode, + bool NoFloat) const; }; } // end namespace llvm Index: llvm-head/lib/Target/SystemZ/SystemZISelLowering.cpp =================================================================== --- llvm-head.orig/lib/Target/SystemZ/SystemZISelLowering.cpp +++ llvm-head/lib/Target/SystemZ/SystemZISelLowering.cpp @@ -20,6 +20,7 @@ #include "llvm/CodeGen/MachineInstrBuilder.h" #include "llvm/CodeGen/MachineRegisterInfo.h" #include "llvm/CodeGen/TargetLoweringObjectFileImpl.h" +#include "llvm/IR/Intrinsics.h" #include <cctype> using namespace llvm; @@ -304,6 +305,9 @@ SystemZTargetLowering::SystemZTargetLowe // Codes for which we want to perform some z-specific combinations. setTargetDAGCombine(ISD::SIGN_EXTEND); + // Handle intrinsics. + setOperationAction(ISD::INTRINSIC_W_CHAIN, MVT::Other, Custom); + // We want to use MVC in preference to even a single load/store pair. MaxStoresPerMemcpy = 0; MaxStoresPerMemcpyOptSize = 0; @@ -1031,6 +1035,53 @@ prepareVolatileOrAtomicLoad(SDValue Chai return DAG.getNode(SystemZISD::SERIALIZE, DL, MVT::Other, Chain); } +// Return true if Op is an intrinsic node with chain that returns the CC value +// as its only (other) argument. Provide the associated SystemZISD opcode and +// the mask of valid CC values if so. +static bool isIntrinsicWithCCAndChain(SDValue Op, unsigned &Opcode, + unsigned &CCValid) { + unsigned Id = cast<ConstantSDNode>(Op.getOperand(1))->getZExtValue(); + switch (Id) { + case Intrinsic::s390_tbegin: + Opcode = SystemZISD::TBEGIN; + CCValid = SystemZ::CCMASK_TBEGIN; + return true; + + case Intrinsic::s390_tbegin_nofloat: + Opcode = SystemZISD::TBEGIN_NOFLOAT; + CCValid = SystemZ::CCMASK_TBEGIN; + return true; + + case Intrinsic::s390_tend: + Opcode = SystemZISD::TEND; + CCValid = SystemZ::CCMASK_TEND; + return true; + + default: + return false; + } +} + +// Emit an intrinsic with chain with a glued value instead of its CC result. +static SDValue emitIntrinsicWithChainAndGlue(SelectionDAG &DAG, SDValue Op, + unsigned Opcode) { + // Copy all operands except the intrinsic ID. + unsigned NumOps = Op.getNumOperands(); + SmallVector<SDValue, 6> Ops; + Ops.reserve(NumOps - 1); + Ops.push_back(Op.getOperand(0)); + for (unsigned I = 2; I < NumOps; ++I) + Ops.push_back(Op.getOperand(I)); + + assert(Op->getNumValues() == 2 && "Expected only CC result and chain"); + SDVTList RawVTs = DAG.getVTList(MVT::Other, MVT::Glue); + SDValue Intr = DAG.getNode(Opcode, SDLoc(Op), RawVTs, Ops); + SDValue OldChain = SDValue(Op.getNode(), 1); + SDValue NewChain = SDValue(Intr.getNode(), 0); + DAG.ReplaceAllUsesOfValueWith(OldChain, NewChain); + return Intr; +} + // CC is a comparison that will be implemented using an integer or // floating-point comparison. Return the condition code mask for // a branch on true. In the integer case, CCMASK_CMP_UO is set for @@ -1588,9 +1639,53 @@ static void adjustForTestUnderMask(Selec C.CCMask = NewCCMask; } +// Return a Comparison that tests the condition-code result of intrinsic +// node Call against constant integer CC using comparison code Cond. +// Opcode is the opcode of the SystemZISD operation for the intrinsic +// and CCValid is the set of possible condition-code results. +static Comparison getIntrinsicCmp(SelectionDAG &DAG, unsigned Opcode, + SDValue Call, unsigned CCValid, uint64_t CC, + ISD::CondCode Cond) { + Comparison C(Call, SDValue()); + C.Opcode = Opcode; + C.CCValid = CCValid; + if (Cond == ISD::SETEQ) + // bit 3 for CC==0, bit 0 for CC==3, always false for CC>3. + C.CCMask = CC < 4 ? 1 << (3 - CC) : 0; + else if (Cond == ISD::SETNE) + // ...and the inverse of that. + C.CCMask = CC < 4 ? ~(1 << (3 - CC)) : -1; + else if (Cond == ISD::SETLT \|\| Cond == ISD::SETULT) + // bits above bit 3 for CC==0 (always false), bits above bit 0 for CC==3, + // always true for CC>3. + C.CCMask = CC < 4 ? -1 << (4 - CC) : -1; + else if (Cond == ISD::SETGE \|\| Cond == ISD::SETUGE) + // ...and the inverse of that. + C.CCMask = CC < 4 ? ~(-1 << (4 - CC)) : 0; + else if (Cond == ISD::SETLE \|\| Cond == ISD::SETULE) + // bit 3 and above for CC==0, bit 0 and above for CC==3 (always true), + // always true for CC>3. + C.CCMask = CC < 4 ? -1 << (3 - CC) : -1; + else if (Cond == ISD::SETGT \|\| Cond == ISD::SETUGT) + // ...and the inverse of that. + C.CCMask = CC < 4 ? ~(-1 << (3 - CC)) : 0; + else + llvm_unreachable("Unexpected integer comparison type"); + C.CCMask &= CCValid; + return C; +} + // Decide how to implement a comparison of type Cond between CmpOp0 with CmpOp1. static Comparison getCmp(SelectionDAG &DAG, SDValue CmpOp0, SDValue CmpOp1, ISD::CondCode Cond) { + if (CmpOp1.getOpcode() == ISD::Constant) { + uint64_t Constant = cast<ConstantSDNode>(CmpOp1)->getZExtValue(); + unsigned Opcode, CCValid; + if (CmpOp0.getOpcode() == ISD::INTRINSIC_W_CHAIN && + CmpOp0.getResNo() == 0 && CmpOp0->hasNUsesOfValue(1, 0) && + isIntrinsicWithCCAndChain(CmpOp0, Opcode, CCValid)) + return getIntrinsicCmp(DAG, Opcode, CmpOp0, CCValid, Constant, Cond); + } Comparison C(CmpOp0, CmpOp1); C.CCMask = CCMaskForCondCode(Cond); if (C.Op0.getValueType().isFloatingPoint()) { @@ -1632,6 +1727,17 @@ static Comparison getCmp(SelectionDAG &D // Emit the comparison instruction described by C. static SDValue emitCmp(SelectionDAG &DAG, SDLoc DL, Comparison &C) { + if (!C.Op1.getNode()) { + SDValue Op; + switch (C.Op0.getOpcode()) { + case ISD::INTRINSIC_W_CHAIN: + Op = emitIntrinsicWithChainAndGlue(DAG, C.Op0, C.Opcode); + break; + default: + llvm_unreachable("Invalid comparison operands"); + } + return SDValue(Op.getNode(), Op->getNumValues() - 1); + } if (C.Opcode == SystemZISD::ICMP) return DAG.getNode(SystemZISD::ICMP, DL, MVT::Glue, C.Op0, C.Op1, DAG.getConstant(C.ICmpType, MVT::i32)); @@ -1713,7 +1819,6 @@ SDValue SystemZTargetLowering::lowerSETC } SDValue SystemZTargetLowering::lowerBR_CC(SDValue Op, SelectionDAG &DAG) const { - SDValue Chain = Op.getOperand(0); ISD::CondCode CC = cast<CondCodeSDNode>(Op.getOperand(1))->get(); SDValue CmpOp0 = Op.getOperand(2); SDValue CmpOp1 = Op.getOperand(3); @@ -1723,7 +1828,7 @@ SDValue SystemZTargetLowering::lowerBR_C Comparison C(getCmp(DAG, CmpOp0, CmpOp1, CC)); SDValue Glue = emitCmp(DAG, DL, C); return DAG.getNode(SystemZISD::BR_CCMASK, DL, Op.getValueType(), - Chain, DAG.getConstant(C.CCValid, MVT::i32), + Op.getOperand(0), DAG.getConstant(C.CCValid, MVT::i32), DAG.getConstant(C.CCMask, MVT::i32), Dest, Glue); } @@ -2561,6 +2666,30 @@ SDValue SystemZTargetLowering::lowerPREF Node->getMemoryVT(), Node->getMemOperand()); } +// Return an i32 that contains the value of CC immediately after After, +// whose final operand must be MVT::Glue. +static SDValue getCCResult(SelectionDAG &DAG, SDNode After) { + SDValue Glue = SDValue(After, After->getNumValues() - 1); + SDValue IPM = DAG.getNode(SystemZISD::IPM, SDLoc(After), MVT::i32, Glue); + return DAG.getNode(ISD::SRL, SDLoc(After), MVT::i32, IPM, + DAG.getConstant(SystemZ::IPM_CC, MVT::i32)); +} + +SDValue +SystemZTargetLowering::lowerINTRINSIC_W_CHAIN(SDValue Op, + SelectionDAG &DAG) const { + unsigned Opcode, CCValid; + if (isIntrinsicWithCCAndChain(Op, Opcode, CCValid)) { + assert(Op->getNumValues() == 2 && "Expected only CC result and chain"); + SDValue Glued = emitIntrinsicWithChainAndGlue(DAG, Op, Opcode); + SDValue CC = getCCResult(DAG, Glued.getNode()); + DAG.ReplaceAllUsesOfValueWith(SDValue(Op.getNode(), 0), CC); + return SDValue(); + } + + return SDValue(); +} + SDValue SystemZTargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) const { switch (Op.getOpcode()) { @@ -2634,6 +2763,8 @@ SDValue SystemZTargetLowering::LowerOper return lowerSTACKRESTORE(Op, DAG); case ISD::PREFETCH: return lowerPREFETCH(Op, DAG); + case ISD::INTRINSIC_W_CHAIN: + return lowerINTRINSIC_W_CHAIN(Op, DAG); default: llvm_unreachable("Unexpected node to lower"); } @@ -2674,6 +2805,9 @@ const char SystemZTargetLowering::getTa OPCODE(SEARCH_STRING); OPCODE(IPM); OPCODE(SERIALIZE); + OPCODE(TBEGIN); + OPCODE(TBEGIN_NOFLOAT); + OPCODE(TEND); OPCODE(ATOMIC_SWAPW); OPCODE(ATOMIC_LOADW_ADD); OPCODE(ATOMIC_LOADW_SUB); @@ -3501,6 +3635,50 @@ SystemZTargetLowering::emitStringWrapper return DoneMBB; } +// Update TBEGIN instruction with final opcode and register clobbers. +MachineBasicBlock * +SystemZTargetLowering::emitTransactionBegin(MachineInstr MI, + MachineBasicBlock MBB, + unsigned Opcode, + bool NoFloat) const { + MachineFunction &MF = MBB->getParent(); + const TargetFrameLowering TFI = Subtarget.getFrameLowering(); + const SystemZInstrInfo TII = Subtarget.getInstrInfo(); + + // Update opcode. + MI->setDesc(TII->get(Opcode)); + + // We cannot handle a TBEGIN that clobbers the stack or frame pointer. + // Make sure to add the corresponding GRSM bits if they are missing. + uint64_t Control = MI->getOperand(2).getImm(); + static const unsigned GPRControlBit[16] = { + 0x8000, 0x8000, 0x4000, 0x4000, 0x2000, 0x2000, 0x1000, 0x1000, + 0x0800, 0x0800, 0x0400, 0x0400, 0x0200, 0x0200, 0x0100, 0x0100 + }; + Control \|= GPRControlBit[15]; + if (TFI->hasFP(MF)) + Control \|= GPRControlBit[11]; + MI->getOperand(2).setImm(Control); + + // Add GPR clobbers. + for (int I = 0; I < 16; I++) { + if ((Control & GPRControlBit[I]) == 0) { + unsigned Reg = SystemZMC::GR64Regs[I]; + MI->addOperand(MachineOperand::CreateReg(Reg, true, true)); + } + } + + // Add FPR clobbers. + if (!NoFloat && (Control & 4) != 0) { + for (int I = 0; I < 16; I++) { + unsigned Reg = SystemZMC::FP64Regs[I]; + MI->addOperand(MachineOperand::CreateReg(Reg, true, true)); + } + } + + return MBB; +} + MachineBasicBlock SystemZTargetLowering:: EmitInstrWithCustomInserter(MachineInstr MI, MachineBasicBlock MBB) const { switch (MI->getOpcode()) { @@ -3742,6 +3920,12 @@ EmitInstrWithCustomInserter(MachineInstr return emitStringWrapper(MI, MBB, SystemZ::MVST); case SystemZ::SRSTLoop: return emitStringWrapper(MI, MBB, SystemZ::SRST); + case SystemZ::TBEGIN: + return emitTransactionBegin(MI, MBB, SystemZ::TBEGIN, false); + case SystemZ::TBEGIN_nofloat: + return emitTransactionBegin(MI, MBB, SystemZ::TBEGIN, true); + case SystemZ::TBEGINC: + return emitTransactionBegin(MI, MBB, SystemZ::TBEGINC, true); default: llvm_unreachable("Unexpected instr type to insert"); } Index: llvm-head/test/CodeGen/SystemZ/htm-intrinsics.ll =================================================================== --- /dev/null +++ llvm-head/test/CodeGen/SystemZ/htm-intrinsics.ll @@ -0,0 +1,352 @@ +; Test transactional-execution intrinsics. +; +; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=zEC12 \| FileCheck %s + +declare i32 @llvm.s390.tbegin(i8 , i32) +declare i32 @llvm.s390.tbegin.nofloat(i8 , i32) +declare void @llvm.s390.tbeginc(i8 , i32) +declare i32 @llvm.s390.tend() +declare void @llvm.s390.tabort(i64) +declare void @llvm.s390.ntstg(i64, i64 ) +declare i32 @llvm.s390.etnd() +declare void @llvm.s390.ppa.txassist(i32) + +; TBEGIN. +define void @test_tbegin() { +; CHECK-LABEL: test_tbegin: +; CHECK-NOT: stmg +; CHECK: std %f8, +; CHECK: std %f9, +; CHECK: std %f10, +; CHECK: std %f11, +; CHECK: std %f12, +; CHECK: std %f13, +; CHECK: std %f14, +; CHECK: std %f15, +; CHECK: tbegin 0, 65292 +; CHECK: ld %f8, +; CHECK: ld %f9, +; CHECK: ld %f10, +; CHECK: ld %f11, +; CHECK: ld %f12, +; CHECK: ld %f13, +; CHECK: ld %f14, +; CHECK: ld %f15, +; CHECK: br %r14 + call i32 @llvm.s390.tbegin(i8 null, i32 65292) + ret void +} + +; TBEGIN (nofloat). +define void @test_tbegin_nofloat1() { +; CHECK-LABEL: test_tbegin_nofloat1: +; CHECK-NOT: stmg +; CHECK-NOT: std +; CHECK: tbegin 0, 65292 +; CHECK: br %r14 + call i32 @llvm.s390.tbegin.nofloat(i8 null, i32 65292) + ret void +} + +; TBEGIN (nofloat) with integer CC return value. +define i32 @test_tbegin_nofloat2() { +; CHECK-LABEL: test_tbegin_nofloat2: +; CHECK-NOT: stmg +; CHECK-NOT: std +; CHECK: tbegin 0, 65292 +; CHECK: ipm %r2 +; CHECK: srl %r2, 28 +; CHECK: br %r14 + %res = call i32 @llvm.s390.tbegin.nofloat(i8 null, i32 65292) + ret i32 %res +} + +; TBEGIN (nofloat) with implicit CC check. +define void @test_tbegin_nofloat3(i32 %ptr) { +; CHECK-LABEL: test_tbegin_nofloat3: +; CHECK-NOT: stmg +; CHECK-NOT: std +; CHECK: tbegin 0, 65292 +; CHECK: jnh {{\.L}} +; CHECK: mvhi 0(%r2), 0 +; CHECK: br %r14 + %res = call i32 @llvm.s390.tbegin.nofloat(i8 null, i32 65292) + %cmp = icmp eq i32 %res, 2 + br i1 %cmp, label %if.then, label %if.end + +if.then: ; preds = %entry + store i32 0, i32* %ptr, align 4 + br label %if.end + +if.end: ; preds = %if.then, %entry + ret void +} + +; TBEGIN (nofloat) with dual CC use. +define i32 @test_tbegin_nofloat4(i32 %pad, i32 %ptr) { +; CHECK-LABEL: test_tbegin_nofloat4: +; CHECK-NOT: stmg +; CHECK-NOT: std +; CHECK: tbegin 0, 65292 +; CHECK: ipm %r2 +; CHECK: srl %r2, 28 +; CHECK: cijlh %r2, 2, {{\.L}} +; CHECK: mvhi 0(%r3), 0 +; CHECK: br %r14 + %res = call i32 @llvm.s390.tbegin.nofloat(i8 null, i32 65292) + %cmp = icmp eq i32 %res, 2 + br i1 %cmp, label %if.then, label %if.end + +if.then: ; preds = %entry + store i32 0, i32 %ptr, align 4 + br label %if.end + +if.end: ; preds = %if.then, %entry + ret i32 %res +} + +; TBEGIN (nofloat) with register. +define void @test_tbegin_nofloat5(i8 %ptr) { +; CHECK-LABEL: test_tbegin_nofloat5: +; CHECK-NOT: stmg +; CHECK-NOT: std +; CHECK: tbegin 0(%r2), 65292 +; CHECK: br %r14 + call i32 @llvm.s390.tbegin.nofloat(i8 %ptr, i32 65292) + ret void +} + +; TBEGIN (nofloat) with GRSM 0x0f00. +define void @test_tbegin_nofloat6() { +; CHECK-LABEL: test_tbegin_nofloat6: +; CHECK: stmg %r6, %r15, +; CHECK-NOT: std +; CHECK: tbegin 0, 3840 +; CHECK: br %r14 + call i32 @llvm.s390.tbegin.nofloat(i8 null, i32 3840) + ret void +} + +; TBEGIN (nofloat) with GRSM 0xf100. +define void @test_tbegin_nofloat7() { +; CHECK-LABEL: test_tbegin_nofloat7: +; CHECK: stmg %r8, %r15, +; CHECK-NOT: std +; CHECK: tbegin 0, 61696 +; CHECK: br %r14 + call i32 @llvm.s390.tbegin.nofloat(i8 null, i32 61696) + ret void +} + +; TBEGIN (nofloat) with GRSM 0xfe00 -- stack pointer added automatically. +define void @test_tbegin_nofloat8() { +; CHECK-LABEL: test_tbegin_nofloat8: +; CHECK-NOT: stmg +; CHECK-NOT: std +; CHECK: tbegin 0, 65280 +; CHECK: br %r14 + call i32 @llvm.s390.tbegin.nofloat(i8 null, i32 65024) + ret void +} + +; TBEGIN (nofloat) with GRSM 0xfb00 -- no frame pointer needed. +define void @test_tbegin_nofloat9() { +; CHECK-LABEL: test_tbegin_nofloat9: +; CHECK: stmg %r10, %r15, +; CHECK-NOT: std +; CHECK: tbegin 0, 64256 +; CHECK: br %r14 + call i32 @llvm.s390.tbegin.nofloat(i8 null, i32 64256) + ret void +} + +; TBEGIN (nofloat) with GRSM 0xfb00 -- frame pointer added automatically. +define void @test_tbegin_nofloat10(i64 %n) { +; CHECK-LABEL: test_tbegin_nofloat10: +; CHECK: stmg %r11, %r15, +; CHECK-NOT: std +; CHECK: tbegin 0, 65280 +; CHECK: br %r14 + %buf = alloca i8, i64 %n + call i32 @llvm.s390.tbegin.nofloat(i8 null, i32 64256) + ret void +} + +; TBEGINC. +define void @test_tbeginc() { +; CHECK-LABEL: test_tbeginc: +; CHECK-NOT: stmg +; CHECK-NOT: std +; CHECK: tbeginc 0, 65288 +; CHECK: br %r14 + call void @llvm.s390.tbeginc(i8 null, i32 65288) + ret void +} + +; TEND with integer CC return value. +define i32 @test_tend1() { +; CHECK-LABEL: test_tend1: +; CHECK: tend +; CHECK: ipm %r2 +; CHECK: srl %r2, 28 +; CHECK: br %r14 + %res = call i32 @llvm.s390.tend() + ret i32 %res +} + +; TEND with implicit CC check. +define void @test_tend3(i32 %ptr) { +; CHECK-LABEL: test_tend3: +; CHECK: tend +; CHECK: je {{\.L}} +; CHECK: mvhi 0(%r2), 0 +; CHECK: br %r14 + %res = call i32 @llvm.s390.tend() + %cmp = icmp eq i32 %res, 2 + br i1 %cmp, label %if.then, label %if.end + +if.then: ; preds = %entry + store i32 0, i32* %ptr, align 4 + br label %if.end + +if.end: ; preds = %if.then, %entry + ret void +} + +; TEND with dual CC use. +define i32 @test_tend2(i32 %pad, i32 %ptr) { +; CHECK-LABEL: test_tend2: +; CHECK: tend +; CHECK: ipm %r2 +; CHECK: srl %r2, 28 +; CHECK: cijlh %r2, 2, {{\.L}} +; CHECK: mvhi 0(%r3), 0 +; CHECK: br %r14 + %res = call i32 @llvm.s390.tend() + %cmp = icmp eq i32 %res, 2 + br i1 %cmp, label %if.then, label %if.end + +if.then: ; preds = %entry + store i32 0, i32* %ptr, align 4 + br label %if.end + +if.end: ; preds = %if.then, %entry + ret i32 %res +} + +; TABORT with register only. +define void @test_tabort1(i64 %val) { +; CHECK-LABEL: test_tabort1: +; CHECK: tabort 0(%r2) +; CHECK: br %r14 + call void @llvm.s390.tabort(i64 %val) + ret void +} + +; TABORT with immediate only. +define void @test_tabort2(i64 %val) { +; CHECK-LABEL: test_tabort2: +; CHECK: tabort 1234 +; CHECK: br %r14 + call void @llvm.s390.tabort(i64 1234) + ret void +} + +; TABORT with register + immediate. +define void @test_tabort3(i64 %val) { +; CHECK-LABEL: test_tabort3: +; CHECK: tabort 1234(%r2) +; CHECK: br %r14 + %sum = add i64 %val, 1234 + call void @llvm.s390.tabort(i64 %sum) + ret void +} + +; TABORT with out-of-range immediate. +define void @test_tabort4(i64 %val) { +; CHECK-LABEL: test_tabort4: +; CHECK: tabort 0({{%r[1-5]}}) +; CHECK: br %r14 + call void @llvm.s390.tabort(i64 4096) + ret void +} + +; NTSTG with base pointer only. +define void @test_ntstg1(i64 %ptr, i64 %val) { +; CHECK-LABEL: test_ntstg1: +; CHECK: ntstg %r3, 0(%r2) +; CHECK: br %r14 + call void @llvm.s390.ntstg(i64 %val, i64 %ptr) + ret void +} + +; NTSTG with base and index. +; Check that VSTL doesn't allow an index. +define void @test_ntstg2(i64 %base, i64 %index, i64 %val) { +; CHECK-LABEL: test_ntstg2: +; CHECK: sllg [[REG:%r[1-5]]], %r3, 3 +; CHECK: ntstg %r4, 0([[REG]],%r2) +; CHECK: br %r14 + %ptr = getelementptr i64, i64 %base, i64 %index + call void @llvm.s390.ntstg(i64 %val, i64 %ptr) + ret void +} + +; NTSTG with the highest in-range displacement. +define void @test_ntstg3(i64 %base, i64 %val) { +; CHECK-LABEL: test_ntstg3: +; CHECK: ntstg %r3, 524280(%r2) +; CHECK: br %r14 + %ptr = getelementptr i64, i64 %base, i64 65535 + call void @llvm.s390.ntstg(i64 %val, i64 %ptr) + ret void +} + +; NTSTG with an out-of-range positive displacement. +define void @test_ntstg4(i64 %base, i64 %val) { +; CHECK-LABEL: test_ntstg4: +; CHECK: ntstg %r3, 0({{%r[1-5]}}) +; CHECK: br %r14 + %ptr = getelementptr i64, i64 %base, i64 65536 + call void @llvm.s390.ntstg(i64 %val, i64 %ptr) + ret void +} + +; NTSTG with the lowest in-range displacement. +define void @test_ntstg5(i64 %base, i64 %val) { +; CHECK-LABEL: test_ntstg5: +; CHECK: ntstg %r3, -524288(%r2) +; CHECK: br %r14 + %ptr = getelementptr i64, i64 %base, i64 -65536 + call void @llvm.s390.ntstg(i64 %val, i64 %ptr) + ret void +} + +; NTSTG with an out-of-range negative displacement. +define void @test_ntstg6(i64 %base, i64 %val) { +; CHECK-LABEL: test_ntstg6: +; CHECK: ntstg %r3, 0({{%r[1-5]}}) +; CHECK: br %r14 + %ptr = getelementptr i64, i64 %base, i64 -65537 + call void @llvm.s390.ntstg(i64 %val, i64 *%ptr) + ret void +} + +; ETND. +define i32 @test_etnd() { +; CHECK-LABEL: test_etnd: +; CHECK: etnd %r2 +; CHECK: br %r14 + %res = call i32 @llvm.s390.etnd() + ret i32 %res +} + +; PPA (Transaction-Abort Assist) +define void @test_ppa_txassist(i32 %val) { +; CHECK-LABEL: test_ppa_txassist: +; CHECK: ppa %r2, 0, 1 +; CHECK: br %r14 + call void @llvm.s390.ppa.txassist(i32 %val) + ret void +} + Index: llvm-head/test/MC/SystemZ/insn-bad-zEC12.s =================================================================== --- llvm-head.orig/test/MC/SystemZ/insn-bad-zEC12.s +++ llvm-head/test/MC/SystemZ/insn-bad-zEC12.s @@ -3,6 +3,22 @@ # RUN: FileCheck < %t %s #CHECK: error: invalid operand +#CHECK: ntstg %r0, -524289 +#CHECK: error: invalid operand +#CHECK: ntstg %r0, 524288 + + ntstg %r0, -524289 + ntstg %r0, 524288 + +#CHECK: error: invalid operand +#CHECK: ppa %r0, %r0, -1 +#CHECK: error: invalid operand +#CHECK: ppa %r0, %r0, 16 + + ppa %r0, %r0, -1 + ppa %r0, %r0, 16 + +#CHECK: error: invalid operand #CHECK: risbgn %r0,%r0,0,0,-1 #CHECK: error: invalid operand #CHECK: risbgn %r0,%r0,0,0,64 @@ -22,3 +38,47 @@ risbgn %r0,%r0,-1,0,0 risbgn %r0,%r0,256,0,0 +#CHECK: error: invalid operand +#CHECK: tabort -1 +#CHECK: error: invalid operand +#CHECK: tabort 4096 +#CHECK: error: invalid use of indexed addressing +#CHECK: tabort 0(%r1,%r2) + + tabort -1 + tabort 4096 + tabort 0(%r1,%r2) + +#CHECK: error: invalid operand +#CHECK: tbegin -1, 0 +#CHECK: error: invalid operand +#CHECK: tbegin 4096, 0 +#CHECK: error: invalid use of indexed addressing +#CHECK: tbegin 0(%r1,%r2), 0 +#CHECK: error: invalid operand +#CHECK: tbegin 0, -1 +#CHECK: error: invalid operand +#CHECK: tbegin 0, 65536 + + tbegin -1, 0 + tbegin 4096, 0 + tbegin 0(%r1,%r2), 0 + tbegin 0, -1 + tbegin 0, 65536 + +#CHECK: error: invalid operand +#CHECK: tbeginc -1, 0 +#CHECK: error: invalid operand +#CHECK: tbeginc 4096, 0 +#CHECK: error: invalid use of indexed addressing +#CHECK: tbeginc 0(%r1,%r2), 0 +#CHECK: error: invalid operand +#CHECK: tbeginc 0, -1 +#CHECK: error: invalid operand +#CHECK: tbeginc 0, 65536 + + tbeginc -1, 0 + tbeginc 4096, 0 + tbeginc 0(%r1,%r2), 0 + tbeginc 0, -1 + tbeginc 0, 65536 Index: llvm-head/test/MC/SystemZ/insn-good-zEC12.s =================================================================== --- llvm-head.orig/test/MC/SystemZ/insn-good-zEC12.s +++ llvm-head/test/MC/SystemZ/insn-good-zEC12.s @@ -1,6 +1,48 @@ # For zEC12 and above. # RUN: llvm-mc -triple s390x-linux-gnu -mcpu=zEC12 -show-encoding %s \| FileCheck %s +#CHECK: etnd %r0 # encoding: [0xb2,0xec,0x00,0x00] +#CHECK: etnd %r15 # encoding: [0xb2,0xec,0x00,0xf0] +#CHECK: etnd %r7 # encoding: [0xb2,0xec,0x00,0x70] + + etnd %r0 + etnd %r15 + etnd %r7 + +#CHECK: ntstg %r0, -524288 # encoding: [0xe3,0x00,0x00,0x00,0x80,0x25] +#CHECK: ntstg %r0, -1 # encoding: [0xe3,0x00,0x0f,0xff,0xff,0x25] +#CHECK: ntstg %r0, 0 # encoding: [0xe3,0x00,0x00,0x00,0x00,0x25] +#CHECK: ntstg %r0, 1 # encoding: [0xe3,0x00,0x00,0x01,0x00,0x25] +#CHECK: ntstg %r0, 524287 # encoding: [0xe3,0x00,0x0f,0xff,0x7f,0x25] +#CHECK: ntstg %r0, 0(%r1) # encoding: [0xe3,0x00,0x10,0x00,0x00,0x25] +#CHECK: ntstg %r0, 0(%r15) # encoding: [0xe3,0x00,0xf0,0x00,0x00,0x25] +#CHECK: ntstg %r0, 524287(%r1,%r15) # encoding: [0xe3,0x01,0xff,0xff,0x7f,0x25] +#CHECK: ntstg %r0, 524287(%r15,%r1) # encoding: [0xe3,0x0f,0x1f,0xff,0x7f,0x25] +#CHECK: ntstg %r15, 0 # encoding: [0xe3,0xf0,0x00,0x00,0x00,0x25] + + ntstg %r0, -524288 + ntstg %r0, -1 + ntstg %r0, 0 + ntstg %r0, 1 + ntstg %r0, 524287 + ntstg %r0, 0(%r1) + ntstg %r0, 0(%r15) + ntstg %r0, 524287(%r1,%r15) + ntstg %r0, 524287(%r15,%r1) + ntstg %r15, 0 + +#CHECK: ppa %r0, %r0, 0 # encoding: [0xb2,0xe8,0x00,0x00] +#CHECK: ppa %r0, %r0, 15 # encoding: [0xb2,0xe8,0xf0,0x00] +#CHECK: ppa %r0, %r15, 0 # encoding: [0xb2,0xe8,0x00,0x0f] +#CHECK: ppa %r4, %r6, 7 # encoding: [0xb2,0xe8,0x70,0x46] +#CHECK: ppa %r15, %r0, 0 # encoding: [0xb2,0xe8,0x00,0xf0] + + ppa %r0, %r0, 0 + ppa %r0, %r0, 15 + ppa %r0, %r15, 0 + ppa %r4, %r6, 7 + ppa %r15, %r0, 0 + #CHECK: risbgn %r0, %r0, 0, 0, 0 # encoding: [0xec,0x00,0x00,0x00,0x00,0x59] #CHECK: risbgn %r0, %r0, 0, 0, 63 # encoding: [0xec,0x00,0x00,0x00,0x3f,0x59] #CHECK: risbgn %r0, %r0, 0, 255, 0 # encoding: [0xec,0x00,0x00,0xff,0x00,0x59] @@ -17,3 +59,68 @@ risbgn %r15,%r0,0,0,0 risbgn %r4,%r5,6,7,8 +#CHECK: tabort 0 # encoding: [0xb2,0xfc,0x00,0x00] +#CHECK: tabort 0(%r1) # encoding: [0xb2,0xfc,0x10,0x00] +#CHECK: tabort 0(%r15) # encoding: [0xb2,0xfc,0xf0,0x00] +#CHECK: tabort 4095 # encoding: [0xb2,0xfc,0x0f,0xff] +#CHECK: tabort 4095(%r1) # encoding: [0xb2,0xfc,0x1f,0xff] +#CHECK: tabort 4095(%r15) # encoding: [0xb2,0xfc,0xff,0xff] + + tabort 0 + tabort 0(%r1) + tabort 0(%r15) + tabort 4095 + tabort 4095(%r1) + tabort 4095(%r15) + +#CHECK: tbegin 0, 0 # encoding: [0xe5,0x60,0x00,0x00,0x00,0x00] +#CHECK: tbegin 4095, 0 # encoding: [0xe5,0x60,0x0f,0xff,0x00,0x00] +#CHECK: tbegin 0, 0 # encoding: [0xe5,0x60,0x00,0x00,0x00,0x00] +#CHECK: tbegin 0, 1 # encoding: [0xe5,0x60,0x00,0x00,0x00,0x01] +#CHECK: tbegin 0, 32767 # encoding: [0xe5,0x60,0x00,0x00,0x7f,0xff] +#CHECK: tbegin 0, 32768 # encoding: [0xe5,0x60,0x00,0x00,0x80,0x00] +#CHECK: tbegin 0, 65535 # encoding: [0xe5,0x60,0x00,0x00,0xff,0xff] +#CHECK: tbegin 0(%r1), 42 # encoding: [0xe5,0x60,0x10,0x00,0x00,0x2a] +#CHECK: tbegin 0(%r15), 42 # encoding: [0xe5,0x60,0xf0,0x00,0x00,0x2a] +#CHECK: tbegin 4095(%r1), 42 # encoding: [0xe5,0x60,0x1f,0xff,0x00,0x2a] +#CHECK: tbegin 4095(%r15), 42 # encoding: [0xe5,0x60,0xff,0xff,0x00,0x2a] + + tbegin 0, 0 + tbegin 4095, 0 + tbegin 0, 0 + tbegin 0, 1 + tbegin 0, 32767 + tbegin 0, 32768 + tbegin 0, 65535 + tbegin 0(%r1), 42 + tbegin 0(%r15), 42 + tbegin 4095(%r1), 42 + tbegin 4095(%r15), 42 + +#CHECK: tbeginc 0, 0 # encoding: [0xe5,0x61,0x00,0x00,0x00,0x00] +#CHECK: tbeginc 4095, 0 # encoding: [0xe5,0x61,0x0f,0xff,0x00,0x00] +#CHECK: tbeginc 0, 0 # encoding: [0xe5,0x61,0x00,0x00,0x00,0x00] +#CHECK: tbeginc 0, 1 # encoding: [0xe5,0x61,0x00,0x00,0x00,0x01] +#CHECK: tbeginc 0, 32767 # encoding: [0xe5,0x61,0x00,0x00,0x7f,0xff] +#CHECK: tbeginc 0, 32768 # encoding: [0xe5,0x61,0x00,0x00,0x80,0x00] +#CHECK: tbeginc 0, 65535 # encoding: [0xe5,0x61,0x00,0x00,0xff,0xff] +#CHECK: tbeginc 0(%r1), 42 # encoding: [0xe5,0x61,0x10,0x00,0x00,0x2a] +#CHECK: tbeginc 0(%r15), 42 # encoding: [0xe5,0x61,0xf0,0x00,0x00,0x2a] +#CHECK: tbeginc 4095(%r1), 42 # encoding: [0xe5,0x61,0x1f,0xff,0x00,0x2a] +#CHECK: tbeginc 4095(%r15), 42 # encoding: [0xe5,0x61,0xff,0xff,0x00,0x2a] + + tbeginc 0, 0 + tbeginc 4095, 0 + tbeginc 0, 0 + tbeginc 0, 1 + tbeginc 0, 32767 + tbeginc 0, 32768 + tbeginc 0, 65535 + tbeginc 0(%r1), 42 + tbeginc 0(%r15), 42 + tbeginc 4095(%r1), 42 + tbeginc 4095(%r15), 42 + +#CHECK: tend # encoding: [0xb2,0xf8,0x00,0x00] + + tend Index: llvm-head/test/MC/SystemZ/insn-bad-z196.s =================================================================== --- llvm-head.orig/test/MC/SystemZ/insn-bad-z196.s +++ llvm-head/test/MC/SystemZ/insn-bad-z196.s @@ -244,6 +244,11 @@ cxlgbr %f0, 16, %r0, 0 cxlgbr %f2, 0, %r0, 0 +#CHECK: error: {{(instruction requires: transactional-execution)?}} +#CHECK: etnd %r7 + + etnd %r7 + #CHECK: error: invalid operand #CHECK: fidbra %f0, 0, %f0, -1 #CHECK: error: invalid operand @@ -546,6 +551,16 @@ locr %r0,%r0,-1 locr %r0,%r0,16 +#CHECK: error: {{(instruction requires: transactional-execution)?}} +#CHECK: ntstg %r0, 524287(%r1,%r15) + + ntstg %r0, 524287(%r1,%r15) + +#CHECK: error: {{(instruction requires: processor-assist)?}} +#CHECK: ppa %r4, %r6, 7 + + ppa %r4, %r6, 7 + #CHECK: error: {{(instruction requires: miscellaneous-extensions)?}} #CHECK: risbgn %r1, %r2, 0, 0, 0 @@ -690,3 +705,24 @@ stocg %r0,-524289,1 stocg %r0,524288,1 stocg %r0,0(%r1,%r2),1 + +#CHECK: error: {{(instruction requires: transactional-execution)?}} +#CHECK: tabort 4095(%r1) + + tabort 4095(%r1) + +#CHECK: error: {{(instruction requires: transactional-execution)?}} +#CHECK: tbegin 4095(%r1), 42 + + tbegin 4095(%r1), 42 + +#CHECK: error: {{(instruction requires: transactional-execution)?}} +#CHECK: tbeginc 4095(%r1), 42 + + tbeginc 4095(%r1), 42 + +#CHECK: error: {{(instruction requires: transactional-execution)?}} +#CHECK: tend + + tend + Index: llvm-head/test/MC/Disassembler/SystemZ/insns.txt =================================================================== --- llvm-head.orig/test/MC/Disassembler/SystemZ/insns.txt +++ llvm-head/test/MC/Disassembler/SystemZ/insns.txt @@ -2503,6 +2503,15 @@ # CHECK: ear %r15, %a15 0xb2 0x4f 0x00 0xff +# CHECK: etnd %r0 +0xb2 0xec 0x00 0x00 + +# CHECK: etnd %r15 +0xb2 0xec 0x00 0xf0 + +# CHECK: etnd %r7 +0xb2 0xec 0x00 0x70 + # CHECK: fidbr %f0, 0, %f0 0xb3 0x5f 0x00 0x00 @@ -6034,6 +6043,36 @@ # CHECK: ny %r15, 0 0xe3 0xf0 0x00 0x00 0x00 0x54 +# CHECK: ntstg %r0, -524288 +0xe3 0x00 0x00 0x00 0x80 0x25 + +# CHECK: ntstg %r0, -1 +0xe3 0x00 0x0f 0xff 0xff 0x25 + +# CHECK: ntstg %r0, 0 +0xe3 0x00 0x00 0x00 0x00 0x25 + +# CHECK: ntstg %r0, 1 +0xe3 0x00 0x00 0x01 0x00 0x25 + +# CHECK: ntstg %r0, 524287 +0xe3 0x00 0x0f 0xff 0x7f 0x25 + +# CHECK: ntstg %r0, 0(%r1) +0xe3 0x00 0x10 0x00 0x00 0x25 + +# CHECK: ntstg %r0, 0(%r15) +0xe3 0x00 0xf0 0x00 0x00 0x25 + +# CHECK: ntstg %r0, 524287(%r1,%r15) +0xe3 0x01 0xff 0xff 0x7f 0x25 + +# CHECK: ntstg %r0, 524287(%r15,%r1) +0xe3 0x0f 0x1f 0xff 0x7f 0x25 + +# CHECK: ntstg %r15, 0 +0xe3 0xf0 0x00 0x00 0x00 0x25 + # CHECK: oc 0(1), 0 0xd6 0x00 0x00 0x00 0x00 0x00 @@ -6346,6 +6385,21 @@ # CHECK: popcnt %r7, %r8 0xb9 0xe1 0x00 0x78 +# CHECK: ppa %r0, %r0, 0 +0xb2 0xe8 0x00 0x00 + +# CHECK: ppa %r0, %r0, 15 +0xb2 0xe8 0xf0 0x00 + +# CHECK: ppa %r0, %r15, 0 +0xb2 0xe8 0x00 0x0f + +# CHECK: ppa %r4, %r6, 7 +0xb2 0xe8 0x70 0x46 + +# CHECK: ppa %r15, %r0, 0 +0xb2 0xe8 0x00 0xf0 + # CHECK: risbg %r0, %r0, 0, 0, 0 0xec 0x00 0x00 0x00 0x00 0x55 @@ -8062,6 +8116,93 @@ # CHECK: sy %r15, 0 0xe3 0xf0 0x00 0x00 0x00 0x5b +# CHECK: tabort 0 +0xb2 0xfc 0x00 0x00 + +# CHECK: tabort 0(%r1) +0xb2 0xfc 0x10 0x00 + +# CHECK: tabort 0(%r15) +0xb2 0xfc 0xf0 0x00 + +# CHECK: tabort 4095 +0xb2 0xfc 0x0f 0xff + +# CHECK: tabort 4095(%r1) +0xb2 0xfc 0x1f 0xff + +# CHECK: tabort 4095(%r15) +0xb2 0xfc 0xff 0xff + +# CHECK: tbegin 0, 0 +0xe5 0x60 0x00 0x00 0x00 0x00 + +# CHECK: tbegin 4095, 0 +0xe5 0x60 0x0f 0xff 0x00 0x00 + +# CHECK: tbegin 0, 0 +0xe5 0x60 0x00 0x00 0x00 0x00 + +# CHECK: tbegin 0, 1 +0xe5 0x60 0x00 0x00 0x00 0x01 + +# CHECK: tbegin 0, 32767 +0xe5 0x60 0x00 0x00 0x7f 0xff + +# CHECK: tbegin 0, 32768 +0xe5 0x60 0x00 0x00 0x80 0x00 + +# CHECK: tbegin 0, 65535 +0xe5 0x60 0x00 0x00 0xff 0xff + +# CHECK: tbegin 0(%r1), 42 +0xe5 0x60 0x10 0x00 0x00 0x2a + +# CHECK: tbegin 0(%r15), 42 +0xe5 0x60 0xf0 0x00 0x00 0x2a + +# CHECK: tbegin 4095(%r1), 42 +0xe5 0x60 0x1f 0xff 0x00 0x2a + +# CHECK: tbegin 4095(%r15), 42 +0xe5 0x60 0xff 0xff 0x00 0x2a + +# CHECK: tbeginc 0, 0 +0xe5 0x61 0x00 0x00 0x00 0x00 + +# CHECK: tbeginc 4095, 0 +0xe5 0x61 0x0f 0xff 0x00 0x00 + +# CHECK: tbeginc 0, 0 +0xe5 0x61 0x00 0x00 0x00 0x00 + +# CHECK: tbeginc 0, 1 +0xe5 0x61 0x00 0x00 0x00 0x01 + +# CHECK: tbeginc 0, 32767 +0xe5 0x61 0x00 0x00 0x7f 0xff + +# CHECK: tbeginc 0, 32768 +0xe5 0x61 0x00 0x00 0x80 0x00 + +# CHECK: tbeginc 0, 65535 +0xe5 0x61 0x00 0x00 0xff 0xff + +# CHECK: tbeginc 0(%r1), 42 +0xe5 0x61 0x10 0x00 0x00 0x2a + +# CHECK: tbeginc 0(%r15), 42 +0xe5 0x61 0xf0 0x00 0x00 0x2a + +# CHECK: tbeginc 4095(%r1), 42 +0xe5 0x61 0x1f 0xff 0x00 0x2a + +# CHECK: tbeginc 4095(%r15), 42 +0xe5 0x61 0xff 0xff 0x00 0x2a + +# CHECK: tend +0xb2 0xf8 0x00 0x00 + # CHECK: tm 0, 0 0x91 0x00 0x00 0x00 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233803 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-01 12:51:43 +00:00
Hal Finkel	54ab6ce385	[PowerPC] FastISel can't handle i1 return values when using CR bits Under normal circumstances, use of CR bits is disabled when running at -O0, but it is enabled by default otherwise, and if you have optnone functions, they'll still generally be generated with crbits turned on (because nothing else turns them off). FastISel can't handle most things dealing with i1 values when using CR bits, and checks for that, but was not checking the return type on functions; we can't fast-isel function calls with i1 return values either when using CR bits for boolean values. Fixes PR22664. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233775 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-01 00:40:48 +00:00
David Majnemer	64386621ec	[WinEH] Generate .xdata for catch handlers This lets us catch exceptions in simple cases. N.B. Things that do not work include (but are not limited to): - Throwing from within a catch handler. - Catching an object with a named catch parameter. - 'CatchHigh' is fictitious, we aren't sure of its purpose. - We aren't entirely efficient with regards to the number of EH states that we generate. - IP-to-State tables are sensitive to the order of emission. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233767 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-31 22:35:44 +00:00
Sanjay Patel	ace5d1d606	typo; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233761 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-31 21:24:47 +00:00
Hal Finkel	72cce21049	[PowerPC] Don't use a vector preferred memory type at -O0 Even at -O0, we fall back to SDAG when we hit intrinsics, and if the intrinsic is a memset/memcpy/etc. we might normally use vector types. At -O0, this is probably not a good idea (because, if there is a bug in the lowering code, there would be no good way to turn it off). At -O0, only use scalar preferred types. Related to PR22754. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233755 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-31 20:56:09 +00:00
Quentin Colombet	6aebd393f0	[AArch64] Enable the codegenprepare optimization that promotes operation to form extended loads. Implement the related target lowering hook so that the optimization has a better estimation of the cost of an extension. rdar://problem/19267165 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233753 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-31 20:52:32 +00:00
Simon Atanasyan	fcadd6d973	[Hexagon] Avoid an unused variable warning when assertions are off No functional changes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233740 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-31 19:43:47 +00:00
Ulrich Weigand	a9204bb532	[SystemZ] Address review comments for r233689 Change lowerCTPOP to: - Gracefully handle a known-zero input value - Simplify computation of significant bit size Thanks to Jay Foad for the review! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233736 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-31 19:28:50 +00:00
Sanjay Patel	1b10376319	[X86, AVX] fix zero-extending integer operand load patterns to use integer instructions This is a follow-on to r233704 and another partial fix for PR22685: https://llvm.org/bugs/show_bug.cgi?id=22685 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233724 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-31 18:43:43 +00:00
Sanjay Patel	7ea151449d	[X86, AVX] try to lowerVectorShuffleAsElementInsertion() for all 256-bit vector sub-types I suggested this change in D7898 (http://llvm.org/viewvc/llvm-project?view=revision&revision=231354) It improves the v4i64 case although not optimally. This AVX codegen: vmovq {{.#+}} xmm0 = mem[0],zero vxorpd %ymm1, %ymm1, %ymm1 vblendpd {{.#+}} ymm0 = ymm0[0],ymm1[1,2,3] Becomes: vmovsd {{.*#+}} xmm0 = mem[0],zero Unfortunately, this doesn't completely solve PR22685. There are still at least 2 problems under here: We're not handling v32i8 / v16i16. We're not getting the FP / int domains right for instruction selection. But since this patch alone appears to do no harm, reduces code duplication, and helps v4i64, I'm submitting this patch ahead of fixing the above. Differential Revision: http://reviews.llvm.org/D8341 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233704 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-31 16:32:11 +00:00
Ulrich Weigand	d2b60b4674	[SystemZ] Add Analysis to required_libraries (fall-out from r233688) Should fix build failures with cmake builds git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233700 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-31 15:16:13 +00:00
Krzysztof Parzyszek	4654bc762e	Expand MUX instructions early on Hexagon This time with all files included. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233696 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-31 13:35:12 +00:00
Krzysztof Parzyszek	b7c19b3cc9	Revert 233694. Weak SVN-fu. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233695 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-31 13:32:32 +00:00
Krzysztof Parzyszek	af4ad2d843	Expand MUX instructions early on Hexagon git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233694 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-31 13:29:17 +00:00
Vladimir Sukharev	e99524cf52	[AArch64] Add v8.1a "Rounding Double Multiply Add/Subtract" extension Reviewers: t.p.northover, jmolloy Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D8502 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233693 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-31 13:15:48 +00:00

1 2 3 4 5 ...

33329 Commits