Commit Graph

6152 Commits

Author SHA1 Message Date
Sanjay Patel
0b654728bc use update_llc_test_checks.py to tighten checking; remove unnecessary testing params
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234029 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 17:17:50 +00:00
Sanjay Patel
c070a8f6ba use update_llc_test_checks.py to tighten checking; remove unnecessary testing params
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234027 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 17:13:31 +00:00
Sanjay Patel
a3ba15a9c9 use update_llc_test_checks.py to tighten checking; remove unnecessary testing params
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234024 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 17:09:37 +00:00
Sanjay Patel
5703d9f0b8 use update_llc_test_checks.py to tighten checking
remove redundant and unnecessary test parameters


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234022 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 17:02:48 +00:00
Sanjay Patel
f58fa53c99 add checks; remove redundant testing parameters
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234020 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 16:44:42 +00:00
Sanjay Patel
8fea4abae0 use update_llc_test_checks.py to tighten checking; remove darwin and sandybridge overspecification
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234017 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 16:06:58 +00:00
Simon Pilgrim
af519fbf42 Added vector tests for DAGCombiner::ReassociateOps
Missing vector tests for rL233482

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234015 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 15:04:46 +00:00
Simon Pilgrim
be149a8148 [X86] Added SSE4.2 CRC32 memory folding patterns + tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234013 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 14:24:40 +00:00
Simon Pilgrim
e5ecd32488 [X86][3DNow] Added 3DNow! memory folding patterns + tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234008 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 11:50:30 +00:00
Simon Pilgrim
cbe57104de [X86][MMX] Added MMX stack folding tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234006 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 11:01:15 +00:00
Simon Pilgrim
4e60da755a [DAGCombiner] Combine shuffles of BUILD_VECTOR and SCALAR_TO_VECTOR
This patch attempts to fold the shuffling of 'scalar source' inputs - BUILD_VECTOR and SCALAR_TO_VECTOR nodes - if the shuffle node is the only user. This folds away a lot of unnecessary shuffle nodes, and allows quite a bit of constant folding that was being missed.

Differential Revision: http://reviews.llvm.org/D8516

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234004 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 10:02:21 +00:00
Sanjay Patel
5b93ab6cde [AVX] Improve insertion of i8 or i16 into low element of 256-bit zero vector
Without this patch, we split the 256-bit vector into halves and produced something like:
	movzwl	(%rdi), %eax
	vmovd	%eax, %xmm0
	vxorps	%xmm1, %xmm1, %xmm1
	vblendps	$15, %ymm0, %ymm1, %ymm0 ## ymm0 = ymm0[0,1,2,3],ymm1[4,5,6,7]

Now, we eliminate the xor and blend because those zeros are free with the vmovd:
        movzwl  (%rdi), %eax
        vmovd   %eax, %xmm0

This should be the final fix needed to resolve PR22685:
https://llvm.org/bugs/show_bug.cgi?id=22685




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233941 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-02 20:21:52 +00:00
Sanjay Patel
8765e82c83 [X86, AVX] adjust tablegen patterns to generate better code for scalar insertion into zero vector (PR23073)
For code like this:

define <8 x i32> @load_v8i32() {
  ret <8 x i32> <i32 7, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>
}

We produce this AVX code:

_load_v8i32:                            ## @load_v8i32
  movl	$7, %eax
  vmovd	%eax, %xmm0
  vxorps	%ymm1, %ymm1, %ymm1
  vblendps	$1, %ymm0, %ymm1, %ymm0 ## ymm0 = ymm0[0],ymm1[1,2,3,4,5,6,7]
  retq

There are at least 2 bugs in play here:

    We're generating a blend when a move scalar does the same job using 2 less instruction bytes (see FIXMEs).
    We're not matching an existing pattern that would eliminate the xor and blend entirely. The zero bytes are free with vmovd.

The 2nd fix involves an adjustment of "AddedComplexity" [1] and mostly masks the 1st problem.

[1] AddedComplexity has close to no documentation in the source. 
The best we have is this comment: "roughly corresponds to the number of nodes that are covered". 
It appears that x86 has bastardized this definition by inflating its values for some other
undocumented reason. For example, we have a pattern with "AddedComplexity = 400" (!). 

I searched my way to this page:
https://groups.google.com/forum/#!topic/llvm-dev/5UX-Og9M0xQ

Differential Revision: http://reviews.llvm.org/D8794



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233931 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-02 17:56:17 +00:00
Elena Demikhovsky
4eb165220f AVX-512: intrinsics for VPADD, VPMULDQ and VPSUB
by Asaf Badouh (asaf.badouh@intel.com)


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233906 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-02 10:51:40 +00:00
Philip Reames
c47a3ae7d5 Teach gcroot how to handle dynamically realigned frames
I'm playing with supporting custom stack map formats with statepoints.  While 
doing so, I noticed that the existing implementation didn't indicate inherently 
unsized frames.  This change essentially just ports the functionality that already 
exists for the default StackMaps section to custom stackmaps.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233891 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-02 05:00:40 +00:00
Benjamin Kramer
a066ed09db [X86] Don't accidentally select shll $1, %eax when shrinking an immediate.
addl has higher throughput and this was needlessly picking a suboptimal
encoding causing PR23098.

I wish there was a way of doing this without further duplicating tbl-
generated patterns, but so far I haven't found one.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233832 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-01 19:01:09 +00:00
Sanjay Patel
1b10376319 [X86, AVX] fix zero-extending integer operand load patterns to use integer instructions
This is a follow-on to r233704 and another partial fix for PR22685:
https://llvm.org/bugs/show_bug.cgi?id=22685



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233724 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 18:43:43 +00:00
Sanjay Patel
7ea151449d [X86, AVX] try to lowerVectorShuffleAsElementInsertion() for all 256-bit vector sub-types
I suggested this change in D7898 (http://llvm.org/viewvc/llvm-project?view=revision&revision=231354)

It improves the v4i64 case although not optimally. This AVX codegen:

  vmovq {{.*#+}} xmm0 = mem[0],zero
  vxorpd %ymm1, %ymm1, %ymm1
  vblendpd {{.*#+}} ymm0 = ymm0[0],ymm1[1,2,3]

Becomes:

  vmovsd {{.*#+}} xmm0 = mem[0],zero

Unfortunately, this doesn't completely solve PR22685. There are still at least 2 problems under here:

    We're not handling v32i8 / v16i16.
    We're not getting the FP / int domains right for instruction selection.

But since this patch alone appears to do no harm, reduces code duplication, and helps v4i64, 
I'm submitting this patch ahead of fixing the above.

Differential Revision: http://reviews.llvm.org/D8341



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233704 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 16:32:11 +00:00
Ahmed Bougacha
18128bd376 [X86] Generate MOVNT for all vector types.
We used to miss non-Q YMM integer vectors, and, non-Q/D XMM integer
vectors.
While there, change the v4i32 patterns to prefer MOVNTDQ.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233668 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 03:16:51 +00:00
Paul Robinson
103d622517 Verify 'optnone' can run DAG combiner when appropriate.
Adds a test to verify the behavior that r233153 restored: 'optnone'
does not spuriously disable the DAG combiner, and in fact there are
cases where the DAG combiner must run (even at -O0 or 'optnone') in
order for codegen to succeed.

Differential Revision: http://reviews.llvm.org/D8614


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233584 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-30 19:37:44 +00:00
Simon Pilgrim
d8f2918479 [X86] Ensure integer domain on scalar i64 load/store stack folding tests. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233553 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-30 15:25:51 +00:00
Elena Demikhovsky
10e73aeede AVX-512: blank lines, duplicated tests, no functional changes
see comments http://reviews.llvm.org/D6835


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233528 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-30 09:29:28 +00:00
Elena Demikhovsky
f5f12f1e92 AVX-512: added intrinsics for VPAND, VPOR and VPXOR
by Asaf Badouh (asaf.badouh@intel.com)


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233525 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-30 08:30:34 +00:00
Benjamin Kramer
155328790b [inline asm] Don't reject duplicated matching constraints
They're harmless and it's easy to generate them from clang, leading to
a crash in LLVM. Found by afl-fuzz.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233500 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-29 20:33:07 +00:00
Duncan P. N. Exon Smith
d397a52305 DebugInfo: Fix testcases with invalid MDSubprogram nodes
Fix testcases that don't pass the verifier after a WIP patch to check
`MDSubprogram` operands more effectively.  I found the following issues:

  - When `isDefinition: false`, the `variables:` field might point at
    `!{i32 786468}`, or at a tuple that pointed at an empty tuple with
    the comment "previously: invalid DW_TAG_base_type" (I vaguely recall
    adding those comments during an upgrade script).  In these cases, I
    just dropped the array.
  - The `variables:` field might point at something like `!{!{!8}}`,
    where `!8` was an `MDLocation`.  I removed the extra layer of
    indirection.
  - Invalid `type:` (not an `MDSubroutineType`).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233466 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-28 02:26:45 +00:00
Duncan P. N. Exon Smith
77a0728d96 DebugInfo: Fix bad debug info for compile units and types
Fix debug info in these tests, which started failing with a WIP patch to
verify compile units and types.  The problems look like they were all
caused by bitrot.  They fell into these categories:

  - Using `!{i32 0}` instead of `!{}`.
  - Using `!{null}` instead of `!{}`.
  - Using `!MDExpression()` instead of `!{}`.
  - Using `!8` instead of `!{!8}`.
  - `file:` references that pointed at `MDCompileUnit`s instead of the
    same `MDFile` as the compile unit.
  - `file:` references that were numerically off-by-one or (off-by-ten).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233415 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 20:46:33 +00:00
Quentin Colombet
1ad854adba [RegisterCoalescer] Refine the terminal rule to still consider the terminal
nodes.
When a node is terminal it is pushed at the end of the list of the copies to
coalesce instead of being completely ignored. In effect, this reduces its
priority over non-terminal nodes.

Because of that, we do not miss the rematerialization opportunities, nor the
copies that can be merged with more complex, than the terminal rule,
interference checks.

Related to PR22768.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233395 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 18:37:15 +00:00
Duncan P. N. Exon Smith
2cee1c9d3c LLParser: Require non-null scope for MDLocation and MDLocalVariable
Change `LLParser` to require a non-null `scope:` field for both
`MDLocation` and `MDLocalVariable`.  There's no need to wait for the
verifier for this check.  This also allows their `::getImpl()` methods
to assert that the incoming scope is non-null.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233394 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 17:56:39 +00:00
Rafael Espindola
121eb4257d Close unique sections when switching away from them.
It is not possible to switch back to unique secitons, so close them
automatically when switching away.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233380 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 15:01:40 +00:00
Philip Reames
76acab3fc7 Require a GC strategy be specified for functions which use gc.statepoint
This was discussed a while back and I left it optional for migration.  Since it's been far more than the 'week or two' that was discussed, time to actually make this manditory.  



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233357 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 05:09:33 +00:00
Philip Reames
28ffcd3f1e Allow explicit spill slots to be specified for a gc.statepoint
This patch adds support for explicitly provided spill slots in the GC arguments of a gc.statepoint.  This is somewhat analogous to gcroot, but leverages the STATEPOINT MI node and StackMap infrastructure.  The motivation for this is:
1) The stack spilling code for gc.statepoints hasn't advanced as fast as I'd like.  One major option is to give up on doing spilling in the backend and do it at the IR level instead.  We'd give up the ability to have gc values in registers, but that's a minor cost in practice.  We are not neccessarily moving in that direction, but having the ability to prototype such a thing cheaply is interesting.
2) I want to port the gcroot lowering to use the statepoint infastructure.  Given the metadata printers for gcroot expect a fixed set of stack roots, it's easiest to just reuse the explicit stack slots and pass them directly to the underlying statepoint.  

I'm holding off on the documentation for the new feature until I'm reasonable sure this is going to stick around.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233356 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 04:52:48 +00:00
Andrew Trick
b827c4b923 Reintroduce the SelectionDAG scheduler test for r233351.
This test returns nonnative integer types which aren't supported on all targets.
The real issue with the SelectionDAG scheduler is with x86 EFLAGS.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233355 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 04:42:52 +00:00
Duncan P. N. Exon Smith
97fcbf1874 DebugInfo: Update testcases with invalid variables
Fix testcases whose variables are invalid.  I'm working on a patch that
adds `Verifier` checks for `MDLocalVariable` (and `MDGlobalVariable`),
and these failed because:

  - `scope:` fields need to point at `MDLocalScope` and can't be null.
  - `file:` fields need to point at `MDFile`.
  - `inlinedAt:` fields need to point at `MDLocation`.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233349 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 01:58:34 +00:00
Andrea Di Biagio
971eb1b382 [X86][FastIsel] Teach how to select vector load instructions.
This patch teaches fast-isel how to select 128-bit vector load instructions.
Added test CodeGen/X86/fast-isel-vecload.ll

Differential Revision: http://reviews.llvm.org/D8605


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233270 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-26 11:29:02 +00:00
Quentin Colombet
45f376c4e6 [RegisterCoalescer] Add a rule to consider more profitable copies first when
those are in the same basic block.
The previous approach was the topological order of the basic block.

By default this rule is disabled.

Related to PR22768.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233241 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-26 01:01:48 +00:00
Simon Pilgrim
eb32048f80 [DAGCombiner] Add support for TRUNCATE + FP_EXTEND vector constant folding
This patch adds supports for the vector constant folding of TRUNCATE and FP_EXTEND instructions and tidies up the SINT_TO_FP and UINT_TO_FP instructions to match.

It also moves the vector constant folding for the FNEG and FABS instructions to use the DAG.getNode() functionality like the other unary instructions.

Differential Revision: http://reviews.llvm.org/D8593

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233224 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 22:30:31 +00:00
Sanjay Patel
e53dbeb2ad [X86, AVX] improve insertion into zero element of 256-bit vector
This patch allows AVX blend instructions to handle insertion into the low
element of a 256-bit vector for the appropriate data types.

For f32, instead of:

   vblendps	$1, %xmm1, %xmm0, %xmm1 ## xmm1 = xmm1[0],xmm0[1,2,3]
   vblendps	$15, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0,1,2,3],ymm0[4,5,6,7]

we get:

   vblendps	$1, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0],ymm0[1,2,3,4,5,6,7]

For f64, instead of:

   vmovsd	%xmm1, %xmm0, %xmm1     ## xmm1 = xmm1[0],xmm0[1]
   vblendpd	$3, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0,1],ymm0[2,3]

we get:

   vblendpd	$1, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0],ymm0[1,2,3]

For the hardware-neglected integer data types, I left a TODO comment in the
code and added regression tests for a follow-on patch.

Differential Revision: http://reviews.llvm.org/D8609



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233199 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 17:36:01 +00:00
Sanjay Patel
62dd074fe9 use update_llc_test_checks.py to tighten checking in these tests
1. There were no CHECK-LABELs, so we could match instructions from the wrong function.
2. The use of zero operands meant multiple xor instructions could match some CHECKs.
3. The test was over-specified to need a Sandybridge CPU and Darwin triple.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233198 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 17:34:11 +00:00
Andrea Di Biagio
3132115738 [X86] Simplify check lines in tests. No functional change.
Also, removed unused check lines from test atomic6432.ll.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233181 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 11:44:19 +00:00
Paul Robinson
759015d80f 'optnone' should not disable DAG combiner.
Reverts the code change from r221168 and the relevant test.
It was a mistake to disable the combiner, and based on the ultimate
definition of 'optnone' we shouldn't have considered the test case
as failing in the first place.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233153 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 00:10:24 +00:00
Reid Kleckner
d639e1975d X86: Fix frameescape when not using an FP
We can't use TargetFrameLowering::getFrameIndexOffset directly, because
Win64 really wants the offset from the stack pointer at the end of the
prologue. Instead, use X86FrameLowering::getFrameIndexOffsetFromSP(),
which is a pretty close approximiation of that. It fails to handle cases
with interestingly large stack alignments, which is pretty uncommon on
Win64 and is TODO.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233137 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 23:46:01 +00:00
Sanjay Patel
fe76881930 [X86, AVX] recognize shufflevector with zero input as a vperm2 (PR22984)
vperm2x128 instructions have the special ability (aka free hardware capability)
to shuffle zero values into a vector.

This patch recognizes that type of shuffle and generates the appropriate
control byte.

https://llvm.org/bugs/show_bug.cgi?id=22984

Differential Revision: http://reviews.llvm.org/D8563



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233100 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 19:19:07 +00:00
Simon Pilgrim
fa17ce8b6e [SelectionDAG] Fixed issue with uitofp vector constant folding being treated as sitofp
While the uitofp scalar constant folding treats an integer as an unsigned value (from lang ref):

%X = sitofp i8 -1 to double ; yields double:-1.0
%Y = uitofp i8 -1 to double ; yields double:255.0

The vector constant folding was always using sitofp:

%X = sitofp <2 x i8> <i8 -1, i8 -1> to <2 x double> ; yields <double -1.0, double -1.0>
%Y = uitofp <2 x i8> <i8 -1, i8 -1> to <2 x double> ; yields <double -1.0, double -1.0>

This patch fixes this so that the correct opcode is used for sitofp and uitofp.

%X = sitofp <2 x i8> <i8 -1, i8 -1> to <2 x double> ; yields <double -1.0, double -1.0>
%Y = uitofp <2 x i8> <i8 -1, i8 -1> to <2 x double> ; yields <double 255.0, double 255.0>

Differential Revision: http://reviews.llvm.org/D8560

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233033 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-23 22:44:55 +00:00
Duncan P. N. Exon Smith
7ad96398c6 DebugInfo: Overload get() in DIDescriptor subclasses
Continue to simplify the `DIDescriptor` subclasses, so that they behave
more like raw pointers.  Remove `getRaw()`, replace it with an
overloaded `get()`, and overload the arrow and cast operators.  Two
testcases started to crash on the arrow operators with this change
because of `scope:` references that weren't real scopes.  I fixed them.
Soon I'll add verifier checks for them too.

This also adds explicit dereference operators.  Previously, the builtin
dereference against `operator MDNode *()` would have worked, but now the
builtins are ambiguous.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233030 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-23 21:54:07 +00:00
Simon Pilgrim
c58f32d981 Tidied up vec_zero_cse.ll test. NFCI.
Added target triple and refactored the CHECKs to be per function.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232894 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-21 14:05:12 +00:00
Eric Christopher
6f125f52d3 Cache the Function dependent subtarget on the MachineFunction.
As preparation for removing the getSubtargetImpl() call from
TargetMachine go ahead and flip the switch on caching the function
dependent subtarget and remove the bare getSubtargetImpl call
from the X86 port. As part of this add a few tests that show we
can generate code and assemble on X86 based on features/cpu on
the Function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232879 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-21 03:13:10 +00:00
Sanjay Patel
39110ecd35 [X86] Prefer blendps over insertps codegen for one special case
With this patch, for this one exact case, we'll generate:

  blendps %xmm0, %xmm1, $1

instead of:

  insertps %xmm0, %xmm1, $0

If there's a memory operand available for load folding and we're
optimizing for size, we'll still generate the insertps.

The detailed performance data motivation for this may be found in D7866; 
in summary, blendps has 2-3x throughput vs. insertps on widely used chips.

Differential Revision: http://reviews.llvm.org/D8332



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232850 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-20 21:19:52 +00:00
Daniel Jasper
70b146b25e [MBP] Don't outline short optional branches
With the option -outline-optional-branches, LLVM will place optional
branches out of line (more details on r231230).

With this patch, this is not done for short optional branches. A short
optional branch is a branch containing a single block with an
instruction count below a certain threshold (defaulting to 3). Still
everything is guarded under -outline-optional-branches).

Outlining a short branch can't significantly improve code locality. It
can however decrease performance because of the additional jmp and in
cases where the optional branch is hot. This fixes a compile time
regression I have observed in a benchmark.

Review: http://reviews.llvm.org/D8108

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232802 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-20 10:00:37 +00:00
Sanjay Patel
11d77223a5 [X86, AVX] use blends instead of insert128 with index 0
Another case of x86-specific shuffle strength reduction:
avoid generating insert*128 instructions with index 0 because
they are slower than their non-lane-changing blend equivalents.

Shuffle lowering already catches most of these cases, but
the zero vector case and some other paths such as in the
modified test in vector-shuffle-256-v32.ll were getting
through.

Differential Revision: http://reviews.llvm.org/D8366


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232773 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-19 22:29:40 +00:00
Simon Pilgrim
4c38456ead Fixed failing test due to missing target triple causing different results on different buildbots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232685 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-18 22:51:45 +00:00