Commit Graph

2566 Commits

Author SHA1 Message Date
Filipe Cabecinhas
b19c087aa7 Lower certain build_vectors to insertps instructions
Summary:
Vectors built with zeros and elements in the same order as another
(source) vector are optimized to be built using a single insertps
instruction.
Also optimize when we move one element in a vector to a different place
in that vector while zeroing out some of the other elements.

Further optimizations are possible, described in TODO comments.
I will be implementing at least some of them in the near future.

Added some tests for different cases where this optimization triggers.

Reviewers: nadav, delena, craig.topper

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3521

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208271 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-08 00:25:16 +00:00
Andrea Di Biagio
8a712ba229 [X86] Improve the lowering of BITCAST dag nodes from type f64 to type v2i32 (and vice versa).
Before this patch, the backend always emitted a store+load sequence to
bitconvert from f64 to i64 the input operand of a ISD::BITCAST dag node that
performed a bitconvert from type MVT::f64 to type MVT::v2i32. The resulting
i64 node was then used to build a v2i32 vector.

With this patch, the backend now produces a cheaper SCALAR_TO_VECTOR from
MVT::f64 to MVT::v2f64. That SCALAR_TO_VECTOR is then followed by a "free"
bitcast to type MVT::v4i32. The elements of the resulting
v4i32 are then extracted to build a v2i32 vector (which is illegal and
therefore promoted to MVT::v2i64).

This is in general cheaper than emitting a stack store+load sequence
to bitconvert the operand from type f64 to type i64.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208107 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-06 17:09:03 +00:00
Renato Golin
22f779d1fd Implememting named register intrinsics
This patch implements the infrastructure to use named register constructs in
programs that need access to specific registers (bare metal, kernels, etc).

So far, only the stack pointer is supported as a technology preview, but as it
is, the intrinsic can already support all non-allocatable registers from any
architecture.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208104 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-06 16:51:25 +00:00
Reid Kleckner
9ad48c11b1 Fix i128 div/mod on mingw64
The Win64 docs are very clear that anything larger than 8 bytes is
passed by reference, and GCC MinGW64 honors that for __modti3 and
friends.

Patch by Jameson Nash!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208029 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-06 01:20:42 +00:00
Filipe Cabecinhas
75ea413a1b Revert "Optimize shufflevector that copies an i64/f64 and zeros the rest."
This reverts commit 207992. I misread the phab number on the LGTM.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207993 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-05 19:40:36 +00:00
Filipe Cabecinhas
a0fa9eb606 Optimize shufflevector that copies an i64/f64 and zeros the rest.
Summary:
Also ran clang-format on the function. The code added is the last else
if block.

Reviewers: nadav, craig.topper

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3518

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207992 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-05 19:36:28 +00:00
Craig Topper
7ae9b5fc71 Use makeArrayRef insted of calling ArrayRef<T> constructor directly. I introduced most of these recently.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207616 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-30 07:17:30 +00:00
Reid Kleckner
9902128e2a Implement X86 code generation for musttail
Currently, musttail codegen is relying on sibcall optimization, and
reporting a fatal error if fails.  Sibcall optimization fails when stack
arguments need to be modified, which is insufficient for musttail.

The logic for moving arguments in memory safely is already implemented
for GuaranteedTailCallOpt.  This change merely arranges for musttail
calls to use it.

No functional change for GuaranteedTailCallOpt.

Reviewers: espindola

Differential Revision: http://reviews.llvm.org/D3493

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207598 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-29 23:55:41 +00:00
Elena Demikhovsky
e3e08acd09 AVX-512: optimized a shuffle pattern to VINSERTI64x4.
Added intrinsics for VPERMT2PS/PD/D/Q instructions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207513 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-29 09:09:15 +00:00
Quentin Colombet
aec1f2c2f5 [X86] Add more details in the comments of X86TargetLowering::getScalingFactorCost.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207432 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-28 18:39:57 +00:00
Craig Topper
c34a25d59d [C++] Use 'nullptr'.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207394 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-28 04:05:08 +00:00
Craig Topper
7e1ae6d9e0 Convert one last signature of getNode to take an ArrayRef of SDUse.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207376 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-27 19:21:06 +00:00
Craig Topper
a7f892b33b Convert SelectionDAG::getMergeValues to use ArrayRef.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207374 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-27 19:20:57 +00:00
Benjamin Kramer
ad1f916eaf X86: If SSE4.1 is missing lower SMUL_LOHI of v4i32 to pmuludq and fix up the high parts.
This is more expensive than pmuldq but still cheaper than scalarizing the whole thing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207370 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-27 18:47:41 +00:00
Craig Topper
72c93595de Convert getMemIntrinsicNode to take ArrayRef of SDValue instead of pointer and size.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207329 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-26 19:29:41 +00:00
Craig Topper
80d8db7a1f Convert SelectionDAG::getNode methods to use ArrayRef<SDValue>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207327 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-26 18:35:24 +00:00
Benjamin Kramer
1ecbfb403c Print X86ISD::PMULDQ nodes properly in debug output.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207322 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-26 16:26:41 +00:00
Benjamin Kramer
9f2c21871c X86: Lower SMUL_LOHI of v4i32 to pmuldq when SSE4.1 is available.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207318 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-26 14:12:19 +00:00
Benjamin Kramer
fb625eadf9 X86: Add patterns for MULHU/MULHS of v8i16 and v16i16.
This gets us pretty code for divs of i16 vectors. Turn the existing
intrinsics into the corresponding nodes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207317 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-26 13:01:03 +00:00
Benjamin Kramer
75125c127d Rip out X86-specific vector SDIV lowering, make the corresponding DAGCombiner transform work on vectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207316 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-26 13:00:53 +00:00
Benjamin Kramer
05e00b6e65 X86: Custom lower v4i32 UMUL_LOHI into 2 pmuludqs.
Test will follow soon.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207314 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-26 12:06:11 +00:00
Quentin Colombet
9e93e47b7f [X86] Implement TargetLowering::getScalingFactorCost hook.
Scaling factors are not free on X86 because every "complex" addressing mode
breaks the related instruction into 2 allocations instead of 1.

<rdar://problem/16730541>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207301 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-26 01:11:26 +00:00
Filipe Cabecinhas
3c02165172 Optimization for certain shufflevector by using insertps.
Summary:
If we're doing a v4f32/v4i32 shuffle on x86 with SSE4.1, we can lower
certain shufflevectors to an insertps instruction:
When most of the shufflevector result's elements come from one vector (and
keep their index), and one element comes from another vector or a memory
operand.

Added tests for insertps optimizations on shufflevector.
Added support and tests for v4i32 vector optimization.

Reviewers: nadav

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D3475

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207291 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-25 23:51:17 +00:00
Craig Topper
c848b1bbcf [C++] Use 'nullptr'. Target edition.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207197 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-25 05:30:21 +00:00
Benjamin Kramer
fda5e19b96 X86: Don't transform shifts into ands when the sign bit is tested.
Should unbreak MultiSource/Benchmarks/mediabench/g721/g721encode/encode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207145 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-24 20:51:37 +00:00
Reid Kleckner
710c1a449d Add 'musttail' marker to call instructions
This is similar to the 'tail' marker, except that it guarantees that
tail call optimization will occur.  It also comes with convervative IR
verification rules that ensure that tail call optimization is possible.

Reviewers: nicholas

Differential Revision: http://llvm-reviews.chandlerc.com/D3240

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207143 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-24 20:14:34 +00:00
Andrea Di Biagio
35f9e1aa49 [X86] Add support for Read Time Stamp Counter x86 builtin intrinsics.
This patch:
- Adds two new X86 builtin intrinsics ('int_x86_rdtsc' and
   'int_x86_rdtscp') as GCCBuiltin intrinsics;
- Teaches the backend how to lower the two new builtins;
- Introduces a common function to lower READCYCLECOUNTER dag nodes
  and the two new rdtsc/rdtscp intrinsics;
- Improves (and extends) the existing x86 test 'rdtsc.ll'; now test 'rdtsc.ll'
  correctly verifies that both READCYCLECOUNTER and the two new intrinsics
  work fine for both 64bit and 32bit Subtargets.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207127 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-24 17:18:27 +00:00
Benjamin Kramer
f43438b6c3 X86: Emit test instead of constant shift + compare if the shift result is unused.
This allows us to compile
  return (mask & 0x8 ? a : b);
into
  testb $8, %dil
  cmovnel %edx, %esi
instead of
  andl  $8, %edi
  shrl  $3, %edi
  cmovnel %edx, %esi

which we formed previously because dag combiner canonicalizes setcc of and into shift.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207088 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-24 08:15:31 +00:00
Elena Demikhovsky
b84cc10c3c AVX-512: store and truncstore for i1 values
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206897 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-22 14:13:10 +00:00
Lang Hames
390592d968 [X86] Use tablegen instead of DAG combines to match BZHI instructions, as
suggested by Ben Kramer in review of r206738.

Thanks again Ben!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206879 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-22 10:41:56 +00:00
Lang Hames
53b4d83b63 [X86] Don't use BZHI for short masks (>=32 bits). Thanks to Ben Kramer for the
review.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206869 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-22 07:40:34 +00:00
Chandler Carruth
42e8630239 [Modules] Fix potential ODR violations by sinking the DEBUG_TYPE
definition below all of the header #include lines, lib/Target/...
edition.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206842 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-22 02:41:26 +00:00
Lang Hames
f69bb5e43c [X86] ISEL (and X, <constant mask>) to BZHI when BMI2 is available.
Generating BZHI in the variable mask case, i.e. (and X, (sub (shl 1, N), 1)),
was already supported, but we were missing the constant-mask case. This patch
fixes that.

<rdar://problem/15480077>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206738 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-21 08:18:53 +00:00
Adam Nemet
d290fa608f [X86] Improve buildFromShuffleMostly for AVX
For a 256-bit BUILD_VECTOR consisting mostly of shuffles of 256-bit vectors,
both the BUILD_VECTOR and its operands may need to be legalized in multiple
steps.  Consider:

(v8f32 (BUILD_VECTOR (extract_vector_elt (v8f32 %vreg0,) Constant<1>),
                     (extract_vector_elt %vreg0, Constant<2>),
                     (extract_vector_elt %vreg0, Constant<3>),
                     (extract_vector_elt %vreg0, Constant<4>),
                     (extract_vector_elt %vreg0, Constant<5>),
                     (extract_vector_elt %vreg0, Constant<6>),
                     (extract_vector_elt %vreg0, Constant<7>),
                     %vreg1))

a. We can't build a 256-bit vector efficiently so, we need to split it into
two 128-bit vecs and combine them with VINSERTX128.

b. Operands like (extract_vector_elt (v8f32 %vreg0), Constant<7>) needs to be
split into a VEXTRACTX128 and a further extract_vector_elt from the
resulting 128-bit vector.

c. The extract_vector_elt from b. is lowered into a shuffle to the first
element and a movss.

Depending on the order in which we legalize the BUILD_VECTOR and its
operands[1], buildFromShuffleMostly may be faced with:

(v4f32 (BUILD_VECTOR (extract_vector_elt
                      (vector_shuffle<1,u,u,u> (extract_subvector %vreg0, Constant<4>), undef),
                      Constant<0>),
                     (extract_vector_elt
                      (vector_shuffle<2,u,u,u> (extract_subvector %vreg0, Constant<4>), undef),
                      Constant<0>),
                     (extract_vector_elt
                      (vector_shuffle<3,u,u,u> (extract_subvector %vreg0, Constant<4>), undef),
                      Constant<0>),
                     %vreg1))

In order to figure out the underlying vector and their identity we need to see
through the shuffles.

[1] Note that the order in which operations and their operands are legalized is
only guaranteed in the first iteration of LegalizeDAG.

Fixes <rdar://problem/16296956>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206634 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-18 19:44:16 +00:00
Andrea Di Biagio
749e8fee34 [X86] Improve the lowering of packed shifts by constant build_vector.
This patch teaches the backend how to efficiently lower logical and
arithmetic packed shifts on both SSE and AVX/AVX2 machines.

When possible, instead of scalarizing a vector shift, the backend should try
to expand the shift into a sequence of two packed shifts by immedate count
followed by a MOVSS/MOVSD.

Example
  (v4i32 (srl A, (build_vector < X, Y, Y, Y>)))

Can be rewritten as:
  (v4i32 (MOVSS (srl A, <Y,Y,Y,Y>), (srl A, <X,X,X,X>)))

[with X and Y ConstantInt]

The advantage is that the two new shifts from the example would be lowered into
X86ISD::VSRLI nodes. This is always cheaper than scalarizing the vector into
four scalar shifts plus four pairs of vector insert/extract.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206316 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-15 19:30:48 +00:00
Nick Lewycky
d63390cba1 Break PseudoSourceValue out of the Value hierarchy. It is now the root of its own tree containing FixedStackPseudoSourceValue (which you can use isa/dyn_cast on) and MipsCallEntry (which you can't). Anything that needs to use either a PseudoSourceValue* and Value* is strongly encouraged to use a MachinePointerInfo instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206255 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-15 07:22:52 +00:00
David Blaikie
b85c7e569a Change argument order and add explanatory comment to r206130
Changes requested in code review by Eric Christopher of r206130.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206219 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-14 22:23:06 +00:00
David Blaikie
f85e6da6d0 Fix instruction debug info location during legalization
I found this from a particular GDB test suite case of inlining
(something similar is provided as a test case) but came across a few
other related cases (other callers of the same functions, and one other
instance of the same coding mistake in a separate function).

I'm not sure what the best way to test this is (let alone to cover the
other cases I discovered), so hopefully this sufficies - open to ideas.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206130 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-13 06:39:55 +00:00
Reid Kleckner
bc1fd917f0 Move the segmented stack switch to a function attribute
This removes the -segmented-stacks command line flag in favor of a
per-function "split-stack" attribute.

Patch by Luqman Aden and Alex Crichton!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205997 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-10 22:58:43 +00:00
Elena Demikhovsky
0d5d656524 AVX-512: insert element to mask vector; store i1 data
Implemented INSERT_VECTOR_ELT operation for v16i1 and v8i1 vectors;
Implemented "store" for i1 type


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205850 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-09 12:37:50 +00:00
Elena Demikhovsky
dbcb670605 AVX-512: Added fp_to_uint and uint_to_fp patterns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205754 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-08 07:24:02 +00:00
Matt Arsenault
a134cec06b Add DAG parameter to ComputeNumSignBitsForTargetNode
This way, you can check the number of sign bits in the
operands. The depth parameter it already has is pretty useless
without this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205649 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-04 20:13:13 +00:00
Craig Topper
84f7f350c3 Make consistent use of MCPhysReg instead of uint16_t throughout the tree.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205610 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-04 05:16:06 +00:00
Yaron Keren
9ee14e3522 Added isTargetWindowsMSVC(), renamed isTargetMingw() to isTargetWindowsGNU()
and isTargetCygwin() to isTargetWindowsCygwin() to be consistent with the
four Windows environments in Triple.h.

Suggestion by Saleem Abdulrasool!



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205393 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-02 04:27:51 +00:00
Yaron Keren
f2dc47ce99 If isKnownWindowsMSVCEnvironment then getOS == Triple::Win32 and
Environment == Triple::MSVC so it will never be MinGW or Cygwin.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205349 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-01 18:52:55 +00:00
Yaron Keren
bec4ff5781 isTargetWindows() renamed to isTargetKnownWindowsMSVC()
to reflect its current functionality.

Based on Takumi NAKAMURA suggestion.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205338 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-01 18:15:34 +00:00
Aaron Ballman
77d76519dc Attempting to fix r205124, which had failed asserts when built with MSVC.
Suggestion from Yaron Keren.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205313 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-01 13:56:35 +00:00
Alexey Volkov
1d75829ddc [x86] Do not convert to cmp32 for Atom arch by Sergey Okunev
Differential Revision: http://llvm-reviews.chandlerc.com/D2824


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@205288 91177308-0d34-0410-b5e6-96231b3b80d8
2014-04-01 08:13:07 +00:00
Rafael Espindola
f165cf7ce8 Prevent alias from pointing to weak aliases.
This adds back r204781.

Original message:

Aliases are just another name for a position in a file. As such, the
regular symbol resolutions are not applied. For example, given

define void @my_func() {
  ret void
}
@my_alias = alias weak void ()* @my_func
@my_alias2 = alias void ()* @my_alias

We produce without this patch:

        .weak   my_alias
my_alias = my_func
        .globl  my_alias2
my_alias2 = my_alias

That is, in the resulting ELF file my_alias, my_func and my_alias are
just 3 names pointing to offset 0 of .text. That is *not* the
semantics of IR linking. For example, linking in a

@my_alias = alias void ()* @other_func

would require the strong my_alias to override the weak one and
my_alias2 would end up pointing to other_func.

There is no way to represent that with aliases being just another
name, so the best solution seems to be to just disallow it, converting
a miscompile into an error.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204934 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-27 15:26:56 +00:00
Rafael Espindola
72db10a995 Revert "Prevent alias from pointing to weak aliases."
This reverts commit r204781.

I will follow up to with msan folks to see what is what they
were trying to do with aliases to weak aliases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@204784 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-26 06:14:40 +00:00