Commit Graph

286 Commits

Author SHA1 Message Date
Justin Holewinski
ce64168930 [NVPTX] Silence a GCC warning found by the buildbots
The cast to NVPTXTargetLowering was missing a 'const', but let's
just access the right pointer through the subtarget anyway.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213793 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 20:23:47 +00:00
Justin Holewinski
03f160f9d3 [NVPTX] mul.wide generation works for any smaller integer source types, not just the next smaller power of two
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213784 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 18:46:03 +00:00
Justin Holewinski
6b50a7d179 [NVPTX] Make sure we do not generate MULWIDE ISD nodes when optimizations are disabled
With optimizations disabled, we disable the isel patterns for mul.wide; but we
were still generating MULWIDE ISD nodes.  Now, we only try to generate MULWIDE
ISD nodes in DAGCombine if the optimization level is not zero.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213773 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-23 17:40:45 +00:00
Tim Northover
b41b1d4bac NVPTX: support fpext/fptrunc to and from f16.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213377 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-18 13:01:43 +00:00
Tim Northover
7bbf5786d7 NVPTX: support direct f16 <-> f64 conversions via intrinsics.
Clang may well start emitting these soon, and while it may not be
directly relevant for OpenCL or GLSL, the instructions were just
sitting there waiting to be used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213356 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-18 08:30:10 +00:00
Justin Holewinski
11ae250ec9 [NVPTX] Improve handling of FP fusion
We now consider the FPOpFusion flag when determining whether
to fuse ops.  We also explicitly emit add.rn when fusion is
disabled to prevent ptxas from fusing the operations on its
own.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213287 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-17 18:10:09 +00:00
Justin Holewinski
07bc0b6ae6 [NVPTX] Add missing .v4 qualifier on vector store instruction
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213276 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-17 16:58:56 +00:00
Justin Holewinski
b26ede07ad [NVPTX] Flag surface/texture query instructions with IsTexSurfQuery
Also, add some tests to make sure we can handle surface/texture
queries on both Fermi and Kepler+.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213268 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-17 14:51:33 +00:00
Justin Holewinski
d6663f565c [NVPTX] Add more surface/texture intrinsics, including CUDA unified texture fetch
This also uses TSFlags to mark machine instructions that are surface/texture
accesses, as well as the vector width for surface operations.  This is used
to simplify some of the switch statements that need to detect surface/texture
instructions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213256 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-17 11:59:04 +00:00
Tim Northover
3e61ccdded CodeGen: extend f16 conversions to permit types > float.
This makes the two intrinsics @llvm.convert.from.f16 and
@llvm.convert.to.f16 accept types other than simple "float". This is
only strictly needed for the truncate operation, since otherwise
double rounding occurs and there's no way to represent the strict IEEE
conversion. However, for symmetry we allow larger types in the extend
too.

During legalization, we can expand an "fp16_to_double" operation into
two extends for convenience, but abort when the truncate isn't legal. A new
libcall is probably needed here.

Even after this commit, various target tweaks are needed to actually use the
extended intrinsics. I've put these into separate commits for clarity, so there
are no actual tests of f64 conversion here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213248 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-17 10:51:23 +00:00
Justin Holewinski
a1535e3b9b [NVPTX] Honor alignment on vector loads/stores
We were not considering the stated alignment on vector loads/stores,
leading us to generate vector instructions even when we do not have
sufficient alignment.

Now, for IR like:

  %1 = load <4 x float>, <4 x float>* %ptr, align 4

we will generate correct, conservative PTX like:

  ld.f32 ... [%ptr]
  ld.f32 ... [%ptr+4]
  ld.f32 ... [%ptr+8]
  ld.f32 ... [%ptr+12]

Or if we have an alignment of 8 (for example), we can
generate code like:

  ld.v2.f32 ... [%ptr]
  ld.v2.f32 ... [%ptr+8]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213186 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-16 19:45:35 +00:00
Justin Holewinski
7e6565112b [NVPTX] Rename registers %fl -> %fd and %rl -> %rd
This matches the internal behavior of NVIDIA tools like libnvvm.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213168 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-16 16:26:58 +00:00
David Majnemer
38d8be1ad8 CodeGen: Stick constant pool entries in COMDAT sections for WinCOFF
COFF lacks a feature that other object file formats support: mergeable
sections.

To work around this, MSVC sticks constant pool entries in special COMDAT
sections so that each constant is in it's own section.  This permits
unused constants to be dropped and it also allows duplicate constants in
different translation units to get merged together.

This fixes PR20262.

Differential Revision: http://reviews.llvm.org/D4482

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213006 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-14 22:57:27 +00:00
NAKAMURA Takumi
95d3cfc51d NVPTX/LLVMBuild.txt: Add "Scalar" to required_libraries. It is really referenced.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212918 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-14 02:52:19 +00:00
Chandler Carruth
70968365db [codegen,aarch64] Add a target hook to the code generator to control
vector type legalization strategies in a more fine grained manner, and
change the legalization of several v1iN types and v1f32 to be widening
rather than scalarization on AArch64.

This fixes an assertion failure caused by scalarizing nodes like "v1i32
trunc v1i64". As v1i64 is legal it will fail to scalarize v1i32.

This also provides a foundation for other targets to have more granular
control over how vector types are legalized.

Patch by Hao Liu, reviewed by Tim Northover. I'm committing it to allow
some work to start taking place on top of this patch as it adds some
really important hooks to the backend that I'd like to immediately start
using. =]

http://reviews.llvm.org/D4322

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212242 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-03 00:23:43 +00:00
Justin Holewinski
8474c35bd8 [NVPTX] Use GreatestCommonDivisor64 from MathExtras instead of using our own. Thanks Hal!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211952 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 19:36:25 +00:00
Justin Holewinski
7a28de08f3 [NVPTX] Add reflect intrinsic (better than matching by function name)
Also clean up some of the logic in NVVMReflect.cpp while we're messing around in there.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211948 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:36:11 +00:00
Justin Holewinski
9832f7dc71 [NVPTX] Handle all possible vector types in getSetCCResultType, not just the ones representable as MVTs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211947 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:36:08 +00:00
Justin Holewinski
c95d327874 [NVPTX] Add 'b' asm constraint
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211946 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:36:06 +00:00
Justin Holewinski
9272c2b67e [NVPTX] Simplify some argument lowering logic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211945 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:36:04 +00:00
Justin Holewinski
127f0e810b [NVPTX] Do not process samplers in GenericToNVVM
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211944 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:36:02 +00:00
Justin Holewinski
3c81367a5d [NVPTX] Error out if initializer is given for variable in an address space that does not support initialization
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211943 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:36:01 +00:00
Justin Holewinski
0ded57ccc5 [NVPTX] Add support for .managed variables for UVM
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211942 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:58 +00:00
Justin Holewinski
2a8dc35cca [NVPTX] Emit .weak linkage for link_once, weak, available_externally, and common linkage
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211941 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:56 +00:00
Justin Holewinski
ab7c0aa662 [NVPTX] Variables that start with llvm. or nvvm. are reserved and should not be emitted
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211940 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:53 +00:00
Justin Holewinski
cb8f98382b [NVPTX] Fix handling of ldg/ldu intrinsics.
The address space of the pointer must be global (1) for these intrinsics.  There must also be alignment metadata attached to the intrinsic calls, e.g.

%val = tail call i32 @llvm.nvvm.ldu.i.global.i32.p1i32(i32 addrspace(1)* %ptr), !align !0

!0 = metadata !{i32 4}

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211939 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:51 +00:00
Justin Holewinski
8992274412 [NVPTX] Clean up argument lowering code and properly handle alignment for structs and vectors
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211938 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:44 +00:00
Justin Holewinski
3fb44103eb [NVPTX] Add missing boolean vector contents flag
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211937 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:42 +00:00
Justin Holewinski
863b0d45a5 [NVPTX] Add support for [SHL,SRA,SRL]_PARTS
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211936 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:40 +00:00
Justin Holewinski
10da1651ed [NVPTX] Implement fma and imad contraction as target DAGCombiner patterns
This also introduces DAGCombiner patterns for mul.wide to multiply two smaller integers and produce a larger integer

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211935 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:37 +00:00
Justin Holewinski
508c80f11f [NVPTX] Add support for efficient rotate instructions on SM 3.2+
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211934 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:33 +00:00
Justin Holewinski
1f75f4a0ee [NVPTX] Add missing isel patterns for 64-bit atomics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211933 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:30 +00:00
Justin Holewinski
ef92cf50d6 [NVPTX] Add isel patterns for bit-field extract (bfe)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211932 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:27 +00:00
Justin Holewinski
de7bbdff33 [NVPTX] Add support for isspacep instruction
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211931 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:24 +00:00
Justin Holewinski
1571d272c8 [NVPTX] Add support for envreg reads
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211930 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:21 +00:00
Justin Holewinski
305dda4fc7 [NVPTX] Add target options for PTX 3.2/4.0 and SM 5.0 (Maxwell)
Default PTX version is set to PTX 3.2

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211929 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:18 +00:00
Justin Holewinski
aac29c0c22 [NVPTX] Update sub-target feature detection
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211928 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:16 +00:00
Justin Holewinski
7d7f3e3923 [NVPTX] Directly control the Machine SSA passes that are invoked for NVPTX.
NVPTX is a bit special in the optimizations it requires, so this gives
us better control over the backend optimization pipeline.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211927 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:14 +00:00
Justin Holewinski
a54609ed93 [NVPTX] Emit .weak when linkage is not external, internal, or private
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211926 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:10 +00:00
Justin Holewinski
d51ee46dc5 [NVPTX] Just use getTypeAllocSize() when computing return value size for structures and vectors
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211925 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 18:35:08 +00:00
Eric Christopher
493512898f Move NVPTX subtarget dependent variables from the target machine
to the subtarget.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211860 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 04:33:14 +00:00
Eric Christopher
ed4589dc16 Use the target lowering we can get off of the DAG rather than off
of the cached target machine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211858 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 03:45:49 +00:00
Eric Christopher
84d545dd34 Move the constructor for NVPTXFrameLowering into the implementation
file in preparation for the subtarget move.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211847 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 02:05:24 +00:00
Eric Christopher
9456c7b20a Remove unnecessary caching of the TargetMachine on NVPTXFrameLowering.
Adjust the constructor accordingly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211846 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 02:05:22 +00:00
Eric Christopher
04c4efc593 Rework the logic for setting the TargetName. This appears to
be shorter and identical in goal.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211845 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 02:05:19 +00:00
Eric Christopher
82cb24a385 Remove caching of the target machine in NVPTXInstrInfo and
update constructor accordingly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211840 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 01:27:08 +00:00
Eric Christopher
6c57c3336c Remove comment that duplicated information in the constructor
that it's after.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211839 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 01:27:06 +00:00
Eric Christopher
ac736351f0 Remove commented out code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211838 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 01:27:05 +00:00
Eric Christopher
e6b542dd2e Remove extraneous parens and extraneous const cast (and fix the
prototype for the function to patch what we were returning).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211837 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-27 01:27:03 +00:00
Alp Toker
8dd8d5c2b2 Revert "Introduce a string_ostream string builder facilty"
Temporarily back out commits r211749, r211752 and r211754.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211814 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-26 22:52:05 +00:00