llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-08-21 18:29:46 +00:00

Author	SHA1	Message	Date
Justin Holewinski	0292a66bb1	[NVPTX] Handle addrspacecast constant expressions in aggregate initializers We need to track if an AddrSpaceCast expression was seen when generating an MCExpr for a ConstantExpr. This change introduces a custom lowerConstant method to the NVPTX asm printer that will create NVPTXGenericMCSymbolRefExpr nodes at the appropriate places to encode the information that a given symbol needs to be casted to a generic address. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236000 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-28 17:18:30 +00:00
Jingyue Wu	b55e9545f2	[NVPTX] Emits "generic()" depending on the original address space Summary: Fixes a bug in the NVPTX codegen. The code used to miss necessary "generic()" on aggregates of addrspacecasts. Test Plan: addrspacecast-gvar.ll Reviewers: eliben, jholewinski Reviewed By: jholewinski Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D9130 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235689 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-24 02:57:30 +00:00
David Blaikie	32b845d223	[opaque pointer type] Add textual IR support for explicit type parameter to the call instruction See r230786 and r230794 for similar changes to gep and load respectively. Call is a bit different because it often doesn't have a single explicit type - usually the type is deduced from the arguments, and just the return type is explicit. In those cases there's no need to change the IR. When that's not the case, the IR usually contains the pointer type of the first operand - but since typed pointers are going away, that representation is insufficient so I'm just stripping the "pointerness" of the explicit type away. This does make the IR a bit weird - it /sort of/ reads like the type of the first operand: "call void () %x(" but %x is actually of type "void ()" and will eventually be just of type "ptr". But this seems not too bad and I don't think it would benefit from repeating the type ("void (), void () %x(" and then eventually "void (), ptr %x(") as has been done with gep and load. This also has a side benefit: since the explicit type is no longer a pointer, there's no ambiguity between an explicit type and a function that returns a function pointer. Previously this case needed an explicit type (eg: a function returning a void() function was written as "call void () () * @x(" rather than "call void () * @x(" because of the ambiguity between a function returning a pointer to a void() function and a function returning void). No ambiguity means even function pointer return types can just be written alone, without writing the whole function's type. This leaves /only/ the varargs case where the explicit type is required. Given the special type syntax in call instructions, the regex-fu used for migration was a bit more involved in its own unique way (as every one of these is) so here it is. Use it in conjunction with the apply.sh script and associated find/xargs commands I've provided in rr230786 to migrate your out of tree tests. Do let me know if any of this doesn't cover your cases & we can iterate on a more general script/regexes to help others with out of tree tests. About 9 test cases couldn't be automatically migrated - half of those were functions returning function pointers, where I just had to manually delete the function argument types now that we didn't need an explicit function type there. The other half were typedefs of function types used in calls - just had to manually drop the * from those. import fileinput import sys import re pat = re.compile(r'((?:=\|:\|^\|\s)call\s(?:[^@]?))(\s$\|\s(?:(?:\[\[[a-zA-Z0-9_]+\]\]\|[@%](?:(")?[\\\?@a-zA-Z0-9_.]?(?(3)"\|)\|{{.}}))(?:$\|$)\|undef\|inttoptr\|bitcast\|null\|asm).$)') addrspace_end = re.compile(r"addrspace\(\d+$\s\$") func_end = re.compile("(?:void.\|\)\s)\$") def conv(match, line): if not match or re.search(addrspace_end, match.group(1)) or not re.search(func_end, match.group(1)): return line return line[:match.start()] + match.group(1)[:match.group(1).rfind('')].rstrip() + match.group(2) + line[match.end():] for line in sys.stdin: sys.stdout.write(conv(re.search(pat, line), line)) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235145 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-16 23:24:18 +00:00
Jan Vesely	a017ce21ba	Revert revisions r234755, r234759, r234760 Revert "Remove default in fully-covered switch (to fix Clang -Werror -Wcovered-switch-default)" Revert "R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO" Revert "LegalizeDAG: Try to use Overflow operations when expanding ADD/SUB" Using overflow operations fails CodeGen/Generic/2011-07-07-ScheduleDAGCrash.ll on hexagon, nvptx, and r600. Revert while I investigate. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234768 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-13 17:47:15 +00:00
Jan Vesely	187ac42686	LegalizeDAG: Try to use Overflow operations when expanding ADD/SUB v2: consider BooleanContents when processing overflow Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewers: resistor, jholewinsky (nvidia parts) Differential Revision: http://reviews.llvm.org/D6340 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234755 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-13 15:32:01 +00:00
Justin Holewinski	e6d6461067	[NVPTX] Associate a minimum PTX version for each SM architecture When a new SM architecture is introduced, it is only supported by the current PTX version and later. Make sure we are using at least the minimum PTX version for the target architecture. This also removes support for PTX ISA < 3.2. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233583 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-30 19:30:55 +00:00
Justin Holewinski	d85053fc3f	[NVPTX] Add options for PTX 4.1/4.2 and SM 3.2/3.7/5.2/5.3 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233575 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-30 18:12:50 +00:00
Artem Belevich	97f4d01ee1	Add support for __nvvm_reflect changes in libdevice in CUDA-7.0 Summary: CUDA 7.0's libdevice uses slightly different IR to call __nvvm_reflect and that triggers an assertion in nvvm_reflect optimization pass. This change allows nvvm_reflect pass to deal with both old and new ways to pass an argument to __nvvm_reflect. Test Plan: ninja check-all Reviewers: eliben, echristo Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D8399 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232732 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-19 17:05:35 +00:00
David Blaikie	5a70dd1d82	[opaque pointer type] Add textual IR support for explicit type parameter to gep operator Similar to gep (r230786) and load (r230794) changes. Similar migration script can be used to update test cases, which successfully migrated all of LLVM and Polly, but about 4 test cases needed manually changes in Clang. (this script will read the contents of stdin and massage it into stdout - wrap it in the 'apply.sh' script shown in previous commits + xargs to apply it over a large set of test cases) import fileinput import sys import re rep = re.compile(r"(getelementptr(?:\s+inbounds)?\s$)((<\d\s+x\s+)?([^@]?)(\|\saddrspace\(\d+$)\s\(?(3)>)\s*)(?=$\|%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|zeroinitializer\|<\|\[\[[a-zA-Z]\|\{\{)", re.MULTILINE \| re.DOTALL) def conv(match): line = match.group(1) line += match.group(4) line += ", " line += match.group(2) return line line = sys.stdin.read() off = 0 for match in re.finditer(rep, line): sys.stdout.write(line[off:match.start()]) sys.stdout.write(conv(match)) off = match.end() sys.stdout.write(line[off:]) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232184 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-13 18:20:45 +00:00
Jingyue Wu	3ea0adcdd5	[NVPTXAsmPrinter] do not print .align on function headers Summary: PTX does not allow .align directives on function headers. Fixes PR21551. Test Plan: test/Codegen/NVPTX/function-align.ll Reviewers: eliben, jholewinski Reviewed By: eliben, jholewinski Subscribers: llvm-commits, eliben, jpienaar, jholewinski Differential Revision: http://reviews.llvm.org/D8274 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232004 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-12 01:50:30 +00:00
David Blaikie	7c9c6ed761	[opaque pointer type] Add textual IR support for explicit type parameter to load instruction Essentially the same as the GEP change in r230786. A similar migration script can be used to update test cases, though a few more test case improvements/changes were required this time around: (r229269-r229278) import fileinput import sys import re pat = re.compile(r"((?:=\|:\|^)\sload (?:atomic )?(?:volatile )?(.?))(\| addrspace$\d+$ )\($\| (?:%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|\[\[[a-zA-Z]\|\{\{).$)") for line in sys.stdin: sys.stdout.write(re.sub(pat, r"\1, \2\3*\4", line)) Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7649 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230794 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-27 21:17:42 +00:00
David Blaikie	198d8baafb	[opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction One of several parallel first steps to remove the target type of pointers, replacing them with a single opaque pointer type. This adds an explicit type parameter to the gep instruction so that when the first parameter becomes an opaque pointer type, the type to gep through is still available to the instructions. * This doesn't modify gep operators, only instructions (operators will be handled separately) * Textual IR changes only. Bitcode (including upgrade) and changing the in-memory representation will be in separate changes. * geps of vectors are transformed as: getelementptr <4 x float> %x, ... ->getelementptr float, <4 x float> %x, ... Then, once the opaque pointer type is introduced, this will ultimately look like: getelementptr float, <4 x ptr> %x with the unambiguous interpretation that it is a vector of pointers to float. * address spaces remain on the pointer, not the type: getelementptr float addrspace(1)* %x ->getelementptr float, float addrspace(1)* %x Then, eventually: getelementptr float, ptr addrspace(1) %x Importantly, the massive amount of test case churn has been automated by same crappy python code. I had to manually update a few test cases that wouldn't fit the script's model (r228970,r229196,r229197,r229198). The python script just massages stdin and writes the result to stdout, I then wrapped that in a shell script to handle replacing files, then using the usual find+xargs to migrate all the files. update.py: import fileinput import sys import re ibrep = re.compile(r"(^.?[^%\w]getelementptr inbounds )(((?:<\d x )?)(.?)(\| addrspace$\d$) \(\|>)(?:$\| (?:%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|\[\[[a-zA-Z]\|\{\{).$))") normrep = re.compile( r"(^.?[^%\w]getelementptr )(((?:<\d* x )?)(.?)(\| addrspace$\d$) \(\|>)(?:$\| (?:%\|@\|null\|undef\|blockaddress\|getelementptr\|addrspacecast\|bitcast\|inttoptr\|\[\[[a-zA-Z]\|\{\{).$))") def conv(match, line): if not match: return line line = match.groups()[0] if len(match.groups()[5]) == 0: line += match.groups()[2] line += match.groups()[3] line += ", " line += match.groups()[1] line += "\n" return line for line in sys.stdin: if line.find("getelementptr ") == line.find("getelementptr inbounds"): if line.find("getelementptr inbounds") != line.find("getelementptr inbounds ("): line = conv(re.match(ibrep, line), line) elif line.find("getelementptr ") != line.find("getelementptr ("): line = conv(re.match(normrep, line), line) sys.stdout.write(line) apply.sh: for name in "$@" do python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name" rm -f "$name.tmp" done The actual commands: From llvm/src: find test/ -name .ll \| xargs ./apply.sh From llvm/src/tools/clang: find test/ -name .mm -o -name .m -o -name .cpp -o -name .c \| xargs -I '{}' ../../apply.sh "{}" From llvm/src/tools/polly: find test/ -name *.ll \| xargs ./apply.sh After that, check-all (with llvm, clang, clang-tools-extra, lld, compiler-rt, and polly all checked out). The extra 'rm' in the apply.sh script is due to a few files in clang's test suite using interesting unicode stuff that my python script was throwing exceptions on. None of those files needed to be migrated, so it seemed sufficient to ignore those cases. Reviewers: rafael, dexonsmith, grosser Differential Revision: http://reviews.llvm.org/D7636 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@230786 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-27 19:29:02 +00:00
Jingyue Wu	f15b696b79	[NVPTX] Emit .pragma "nounroll" for loops marked with nounroll Summary: CUDA driver can unroll loops when jit-compiling PTX. To prevent CUDA driver from unrolling a loop marked with llvm.loop.unroll.disable is not unrolled by CUDA driver, we need to emit .pragma "nounroll" at the header of that loop. This patch also extracts getting unroll metadata from loop ID metadata into a shared helper function. Test Plan: test/CodeGen/NVPTX/nounroll.ll Reviewers: eliben, meheff, jholewinski Reviewed By: jholewinski Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D7041 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227703 91177308-0d34-0410-b5e6-96231b3b80d8	2015-02-01 02:27:45 +00:00
Justin Holewinski	33eac3ee53	[NVPTX] Generate a more optimal sequence for select of i1 Instead of creating a pattern like "(p && a) \|\| ((!p) && b)", just expand the i8 operands to i32 and perform the selp on them. Fixes PR22246 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227123 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-26 19:52:20 +00:00
Justin Holewinski	48e872230d	[NVPTX] Handle floating-point conversion patterns that are not explicitly ordered or unordered Fixes PR22322 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227117 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-26 19:11:20 +00:00
Olivier Sallenave	735aa71398	Check that the TLI callback enableAggressiveFMAFusion has the desired effect on FMA folding. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225987 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 15:36:28 +00:00
Jingyue Wu	9830b459be	[NVPTX] Fix bugs related to isSingleValueType Summary: With isSingleValueType starting to treat vector types as single-value types, code that uses this interface needs to be updated. Test Plan: vector-global.ll nvcl-param-align.ll Reviewers: jholewinski Reviewed By: jholewinski Subscribers: llvm-commits, meheff, eliben, jholewinski Differential Revision: http://reviews.llvm.org/D6573 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224440 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-17 17:59:04 +00:00
Duncan P. N. Exon Smith	1ef70ff39b	IR: Make metadata typeless in assembly Now that `Metadata` is typeless, reflect that in the assembly. These are the matching assembly changes for the metadata/value split in r223802. - Only use the `metadata` type when referencing metadata from a call intrinsic -- i.e., only when it's used as a `Value`. - Stop pretending that `ValueAsMetadata` is wrapped in an `MDNode` when referencing it from call intrinsics. So, assembly like this: define @foo(i32 %v) { call void @llvm.foo(metadata !{i32 %v}, metadata !0) call void @llvm.foo(metadata !{i32 7}, metadata !0) call void @llvm.foo(metadata !1, metadata !0) call void @llvm.foo(metadata !3, metadata !0) call void @llvm.foo(metadata !{metadata !3}, metadata !0) ret void, !bar !2 } !0 = metadata !{metadata !2} !1 = metadata !{i32* @global} !2 = metadata !{metadata !3} !3 = metadata !{} turns into this: define @foo(i32 %v) { call void @llvm.foo(metadata i32 %v, metadata !0) call void @llvm.foo(metadata i32 7, metadata !0) call void @llvm.foo(metadata i32* @global, metadata !0) call void @llvm.foo(metadata !3, metadata !0) call void @llvm.foo(metadata !{!3}, metadata !0) ret void, !bar !2 } !0 = !{!2} !1 = !{i32* @global} !2 = !{!3} !3 = !{} I wrote an upgrade script that handled almost all of the tests in llvm and many of the tests in cfe (even handling many `CHECK` lines). I've attached it (or will attach it in a moment if you're speedy) to PR21532 to help everyone update their out-of-tree testcases. This is part of PR21532. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224257 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-15 19:07:53 +00:00
Duncan P. N. Exon Smith	8f616165e4	IR: Canonicalize metadata formatting, NFC Canonicalize formatting of metadata to make it easier to upgrade via scripts -- in particular, one line per metadata definition makes it more `sed`-able. This is preparation for changing the assembly syntax for metadata [1]. [1]: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20141208/248449.html git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224002 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-11 06:32:29 +00:00
Jingyue Wu	b043278834	[NVPTX] Do not emit .weak symbols for NVPTX Summary: ".weak" symbols cannot be consumed by ptxas (PR21685). This patch makes the weak directive in MCAsmPrinter customizable, and disables emitting ".weak" symbols for NVPTX. Test Plan: weak-linkage.ll Reviewers: jholewinski Reviewed By: jholewinski Subscribers: majnemer, jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D6455 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223077 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-01 21:16:17 +00:00
Justin Holewinski	e459c0bf65	[NVPTX] Add NVPTXLowerStructArgs pass This works around the limitation that PTX does not allow .param space loads/stores with arbitrary pointers. If a function has a by-val struct ptr arg, say foo(%struct.x byval %d), then add the following instructions to the first basic block : %temp = alloca %struct.x, align 8 %tt1 = bitcast %struct.x %d to i8 * %tt2 = llvm.nvvm.cvt.gen.to.param %tt2 %tempd = bitcast i8 addrspace(101) * to %struct.x addrspace(101) * %tv = load %struct.x addrspace(101) * %tempd store %struct.x %tv, %struct.x * %temp, align 8 The above code allocates some space in the stack and copies the incoming struct from param space to local space. Then replace all occurences of %d by %temp. Fixes PR21465. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221377 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-05 18:19:30 +00:00
Jingyue Wu	1d1d705a95	[NVPTX] aligned byte-buffers for vector return types Summary: Fixes PR21100 which is caused by inconsistency between the declared return type and the expected return type at the call site. The new behavior is consistent with nvcc and the NVPTXTargetLowering::getPrototype function. Test Plan: test/Codegen/NVPTX/vector-return.ll Reviewers: jholewinski Reviewed By: jholewinski Subscribers: llvm-commits, meheff, eliben, jholewinski Differential Revision: http://reviews.llvm.org/D5612 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220607 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-25 03:46:16 +00:00
Jingyue Wu	75d77cd179	[MachineSink] Use the real post dominator tree Summary: Fixes a FIXME in MachineSinking. Instead of using the simple heuristics in isPostDominatedBy, use the real MachinePostDominatorTree and MachineLoopInfo. The old heuristics caused instructions to sink unnecessarily, and might create register pressure. This is the second try of the fix. The first one (D4814) caused a performance regression due to failing to sink instructions out of loops (PR21115). This patch fixes PR21115 by sinking an instruction from a deeper loop to a shallower one regardless of whether the target block post-dominates the source. Thanks Alexey Volkov for reporting PR21115! Test Plan: Added a NVPTX codegen test to verify that our change prevents the backend from over-sinking. It also shows the unnecessary register pressure caused by over-sinking. Added an X86 test to verify we can sink instructions out of loops regardless of the dominance relationship. This test is reduced from Alexey's test in PR21115. Updated an affected test in X86. Also ran SPEC CINT2006 and llvm-test-suite for compilation time and runtime performance. Results are attached separately in the review thread. Reviewers: Jiangning, resistor, hfinkel Reviewed By: hfinkel Subscribers: hfinkel, bruno, volkalexey, llvm-commits, meheff, eliben, jholewinski Differential Revision: http://reviews.llvm.org/D5633 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219773 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-15 03:27:43 +00:00
Jingyue Wu	ccd995ab0c	Revert r216862 due to a performance regression Reported by Alexey Volkov in PR21115 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218771 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-01 15:22:13 +00:00
Jingyue Wu	d43e6df10b	[MachineSink] Use the real post dominator tree Summary: Fixes a FIXME in MachineSinking. Instead of using the simple heuristics in isPostDominatedBy, use the real MachinePostDominatorTree. The old heuristics caused instructions to sink unnecessarily, and might create register pressure. Test Plan: Added a NVPTX codegen test to verify that our change is in effect. It also shows the unnecessary register pressure caused by over-sinking. Updated affected tests in AArch64 and X86. Reviewers: eliben, meheff, Jiangning Reviewed By: Jiangning Subscribers: jholewinski, aemerson, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D4814 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216862 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-01 03:47:25 +00:00
Jingyue Wu	87a2b36cf6	[NVPTX] Make the alignment an explicit argument to ldu/ldg Summary: Instead of specifying the alignment as metadata which may be destroyed by transformation passes, make the alignment the second argument to ldu/ldg intrinsic calls. Test Plan: ldu-ldg.ll ldu-i8.ll ldu-reg-plus-offset.ll Reviewers: eliben, meheff, jholewinski Reviewed By: meheff, jholewinski Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D5093 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216731 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-29 15:30:20 +00:00
Justin Holewinski	2941802bad	[NVPTX] Add some extra tests for mul.wide to test non-power-of-two source types git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213794 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-23 20:23:49 +00:00
Justin Holewinski	03f160f9d3	[NVPTX] mul.wide generation works for any smaller integer source types, not just the next smaller power of two git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213784 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-23 18:46:03 +00:00
Justin Holewinski	6b50a7d179	[NVPTX] Make sure we do not generate MULWIDE ISD nodes when optimizations are disabled With optimizations disabled, we disable the isel patterns for mul.wide; but we were still generating MULWIDE ISD nodes. Now, we only try to generate MULWIDE ISD nodes in DAGCombine if the optimization level is not zero. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213773 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-23 17:40:45 +00:00
Eli Bendersky	07efe87198	Add some tests for NVPTX lowering of cmpxchg git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213586 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-21 22:54:44 +00:00
Eli Bendersky	bd9c6c0773	Add tests for atomic adds on floats. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213406 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-18 20:11:26 +00:00
Eli Bendersky	6a33571ba9	Use CHECK-LABEL where appropriate in this test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213398 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-18 19:32:09 +00:00
Tim Northover	b41b1d4bac	NVPTX: support fpext/fptrunc to and from f16. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213377 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-18 13:01:43 +00:00
Tim Northover	0afed03229	CodeGen: soften f16 type by default instead of marking legal. Actual support for softening f16 operations is still limited, and can be added when it's needed. But Soften is much closer to being a useful thing to try than keeping it Legal when no registers can actually hold such values. Longer term, we probably want something between Soften and Promote semantics for most targets, it'll be more efficient to promote the 4 basic operations to f32 than libcall them. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213372 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-18 12:41:46 +00:00
Tim Northover	7bbf5786d7	NVPTX: support direct f16 <-> f64 conversions via intrinsics. Clang may well start emitting these soon, and while it may not be directly relevant for OpenCL or GLSL, the instructions were just sitting there waiting to be used. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213356 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-18 08:30:10 +00:00
Justin Holewinski	11ae250ec9	[NVPTX] Improve handling of FP fusion We now consider the FPOpFusion flag when determining whether to fuse ops. We also explicitly emit add.rn when fusion is disabled to prevent ptxas from fusing the operations on its own. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213287 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 18:10:09 +00:00
Justin Holewinski	07bc0b6ae6	[NVPTX] Add missing .v4 qualifier on vector store instruction git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213276 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 16:58:56 +00:00
Justin Holewinski	b26ede07ad	[NVPTX] Flag surface/texture query instructions with IsTexSurfQuery Also, add some tests to make sure we can handle surface/texture queries on both Fermi and Kepler+. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213268 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 14:51:33 +00:00
Justin Holewinski	d6663f565c	[NVPTX] Add more surface/texture intrinsics, including CUDA unified texture fetch This also uses TSFlags to mark machine instructions that are surface/texture accesses, as well as the vector width for surface operations. This is used to simplify some of the switch statements that need to detect surface/texture instructions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213256 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-17 11:59:04 +00:00
Justin Holewinski	a1535e3b9b	[NVPTX] Honor alignment on vector loads/stores We were not considering the stated alignment on vector loads/stores, leading us to generate vector instructions even when we do not have sufficient alignment. Now, for IR like: %1 = load <4 x float>, <4 x float>* %ptr, align 4 we will generate correct, conservative PTX like: ld.f32 ... [%ptr] ld.f32 ... [%ptr+4] ld.f32 ... [%ptr+8] ld.f32 ... [%ptr+12] Or if we have an alignment of 8 (for example), we can generate code like: ld.v2.f32 ... [%ptr] ld.v2.f32 ... [%ptr+8] git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213186 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-16 19:45:35 +00:00
Justin Holewinski	7e6565112b	[NVPTX] Rename registers %fl -> %fd and %rl -> %rd This matches the internal behavior of NVIDIA tools like libnvvm. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213168 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-16 16:26:58 +00:00
Justin Holewinski	7a28de08f3	[NVPTX] Add reflect intrinsic (better than matching by function name) Also clean up some of the logic in NVVMReflect.cpp while we're messing around in there. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211948 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-27 18:36:11 +00:00
Justin Holewinski	c95d327874	[NVPTX] Add 'b' asm constraint git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211946 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-27 18:36:06 +00:00
Justin Holewinski	3c81367a5d	[NVPTX] Error out if initializer is given for variable in an address space that does not support initialization git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211943 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-27 18:36:01 +00:00
Justin Holewinski	0ded57ccc5	[NVPTX] Add support for .managed variables for UVM git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211942 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-27 18:35:58 +00:00
Justin Holewinski	2a8dc35cca	[NVPTX] Emit .weak linkage for link_once, weak, available_externally, and common linkage git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211941 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-27 18:35:56 +00:00
Justin Holewinski	cb8f98382b	[NVPTX] Fix handling of ldg/ldu intrinsics. The address space of the pointer must be global (1) for these intrinsics. There must also be alignment metadata attached to the intrinsic calls, e.g. %val = tail call i32 @llvm.nvvm.ldu.i.global.i32.p1i32(i32 addrspace(1)* %ptr), !align !0 !0 = metadata !{i32 4} git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211939 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-27 18:35:51 +00:00
Justin Holewinski	8992274412	[NVPTX] Clean up argument lowering code and properly handle alignment for structs and vectors git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211938 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-27 18:35:44 +00:00
Justin Holewinski	863b0d45a5	[NVPTX] Add support for [SHL,SRA,SRL]_PARTS git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211936 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-27 18:35:40 +00:00
Justin Holewinski	10da1651ed	[NVPTX] Implement fma and imad contraction as target DAGCombiner patterns This also introduces DAGCombiner patterns for mul.wide to multiply two smaller integers and produce a larger integer git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211935 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-27 18:35:37 +00:00

1 2 3

128 Commits