Commit Graph

29747 Commits

Author SHA1 Message Date
Reid Kleckner
d1807ff318 [WinEH] Replace more lpad value uses with undef
We were asserting on code like this:
  extern "C" unsigned long _exception_code();
  void might_crash(unsigned long);
  void foo() {
    __try {
      might_crash(0);
    } __except(1) {
      might_crash(_exception_code());
    }
  }

Gtest and many other libraries get the exception code from the __except
block. What's supposed to happen here is that EAX is live into the
__except block, and it contains the exception code. Eventually we'll
represent that as a use of the landingpad ehptr value, but for now we
can replace it with undef.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235649 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 21:22:30 +00:00
Quentin Colombet
c364314ec3 [MachineCopyPropagation] Handle undef flags conservatively so that we do not
remove copies that are useful after breaking some hardware dependencies.
In other words, handle this kind of situations conservatively by assuming reg2
is redefined by the undef flag.
reg1 = copy reg2
= inst reg2<undef>
reg2 = copy reg1
Copy propagation used to remove the last copy.
This is incorrect because the undef flag on reg2 in inst, allows next
passes to put whatever trashed value in reg2 that may help.
In practice we end up with this code:
reg1 = copy reg2
reg2 = 0
= inst reg2<undef>
reg2 = copy reg1

This fixes PR21743.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235647 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 21:17:39 +00:00
Tom Stellard
2aab32cade R600/SI: Fix indirect addressing with a negative constant offset
When the base register index of the vector plus the constant offset
was less than zero, we were passing the wrong base register to the indirect
addressing instruction.

In this case, we need to set the base register to v0 and then add
the computed (negative) index to m0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235641 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 20:32:01 +00:00
Peter Collingbourne
391b2c39f7 Thumb2: When applying branch optimizations, visit branches in reverse order.
The order in which branches appear in ImmBranches is approximately their
order within the function body. By visiting later branches first, we reduce
the distance between earlier forward branches and their targets, making it
more likely that the cbn?z optimization, which can only apply to forward
branches, will succeed for those earlier branches.

Differential Revision: http://reviews.llvm.org/D9185

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235640 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 20:31:35 +00:00
Peter Collingbourne
1ad0f74155 ARM: When re-creating a branch via InsertBranch, preserve CPSR flags.
In particular, this preserves the kill flag, which allows the Thumb2 cbn?z
optimization to be applied in cases where a branch has been re-created after
the live variables analysis pass, e.g. by the machine block placement pass.

This appears to be low risk; a number of other targets seem to already be
doing something similar, e.g. AArch64, PowerPC.

Differential Revision: http://reviews.llvm.org/D9184

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235639 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 20:31:32 +00:00
Peter Collingbourne
d9a479e5a0 Thumb2: When optimizing for size, do not if-convert branches involving comparisons with zero.
This allows the constant island pass to lower these branches to cbn?z
instructions, resulting in a shorter instruction sequence.

Differential Revision: http://reviews.llvm.org/D9183

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235638 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 20:31:30 +00:00
Peter Collingbourne
f86c29ea2c ARM: When spilling extra registers for alignment, prefer low registers on all Thumb targets.
This makes it more likely that we can use the 16-bit push and pop instructions
on Thumb-2, saving around 4 bytes per function.

Differential Revision: http://reviews.llvm.org/D9165

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235637 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 20:31:26 +00:00
Peter Collingbourne
b28abbf98b ARM: Only enforce 4-byte alignment on Thumb-2 functions with constant pools.
This appears to have been introduced back in r76698 as part of an unrelated
change. I can find no official ARM documentation stating that Thumb-2 functions
require 4-byte alignment; in fact, ARM documentation appears to contradict
this (see, e.g., ARM Architecture Reference Manual Thumb-2 Supplement,
section 2.6.1: "Thumb-2 enforces 16-bit alignment on all instructions.").

Also remove code that sets alignment for ARM functions, which is redundant
with code in the MachineFunction constructor, and remove the hidden
-arm-align-constant-islands flag, which has been enabled by default since
r146739 (Dec 2011) and has probably received sufficient testing by now.

Differential Revision: http://reviews.llvm.org/D9138

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235636 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 20:31:22 +00:00
Adam Nemet
50b9e7f7d4 [getUnderlyingOjbects] Analyze loop PHIs further to remove false positives
Specifically, if a pointer accesses different underlying objects in each
iteration, don't look through the phi node defining the pointer.

The motivating case is the underlyling-objects-2.ll testcase.  Consider
the loop nest:

  int **A;
  for (i)
    for (j)
       A[i][j] = A[i-1][j] * B[j]

This loop is transformed by Load-PRE to stash away A[i] for the next
iteration of the outer loop:

  Curr = A[0];          // Prev_0
  for (i: 1..N) {
    Prev = Curr;        // Prev = PHI (Prev_0, Curr)
    Curr = A[i];
    for (j: 0..N)
       Curr[j] = Prev[j] * B[j]
  }

Since A[i] and A[i-1] are likely to be independent pointers,
getUnderlyingObjects should not assume that Curr and Prev share the same
underlying object in the inner loop.

If it did we would try to dependence-analyze Curr and Prev and the
analysis of the corresponding SCEVs would fail with non-constant
distance.

To fix this, the getUnderlyingObjects API is extended with an optional
LoopInfo parameter.  This is effectively what controls whether we want
the above behavior or the original.  Currently, I only changed to use
this approach for LoopAccessAnalysis.

The other testcase is to guard the opposite case where we do want to
look through the loop PHI.  If we step through an array by incrementing
a pointer, the underlying object is the incoming value of the phi as the
loop is entered.

Fixes rdar://problem/19566729

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235634 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 20:09:20 +00:00
Jingyue Wu
12f341611a [NVPTX] run SeparateConstOffsetFromGEP before SLSR
Summary:
We pick this order because SeparateConstOffsetFromGEP may create more
opportunities for SLSR.

Test Plan:
reassociate-geps-and-slsr.ll
no performance regression on internal benchmarks

Reviewers: meheff

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D9230

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235632 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 20:00:04 +00:00
Tom Stellard
e32631cecd R600/SI: Add missing -mcpu=SI to assembler test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235630 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 19:33:55 +00:00
Tom Stellard
59edae9b85 R600/SI: Add assembler support for all CI and VI VOP1 instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235629 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 19:33:54 +00:00
Tom Stellard
95081f5241 R600/SI: Improve AsmParser support for forced e64 encoding
We can now force e64 encoding even when the operands would be legal
for e32 encoding.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235626 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 19:33:48 +00:00
Reid Kleckner
70e56ae6b3 Revert "[SEH] Remove the old __C_specific_handler code now that WinEHPrepare works"
We still have some "uses remain after removal" issues in -O0 builds.

This reverts commit r235557.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235617 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 18:34:01 +00:00
Hal Finkel
184f8f7c10 [PowerPC] Enable printing instructions using aliases
TableGen had been nicely generating code to print a number of instructions using
shorter aliases (and PowerPC has plenty of short mnemonics), but we were not
calling it. For some of the aliases we support in the parser, TableGen can't
infer the "inverse" alias relationship, so there is still more to do.

Thus, after some hours of updating test cases...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235616 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 18:30:38 +00:00
Pirama Arumuga Nainar
dab5145cb3 [AArch64] Add nvcast patterns for v4f16 and v8f16
Summary:
Constant stores of f16 vectors can create NvCast nodes from various
operand types to v4f16 or v8f16 depending on patterns in the stored
constants.  This patch adds nvcast rules with v4f16 and v8f16 values.

AArchISelLowering::LowerBUILD_VECTOR has the details on which constant
patterns generate the nvcast nodes.

Reviewers: jmolloy, srhines, ab

Subscribers: rengolin, aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D9201

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235610 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 17:32:25 +00:00
Pirama Arumuga Nainar
b7db5f28c5 [AArch64] Handle vec4, vec8, vec16 *itofp for half
Summary:
Set operation action for SINT_TO_FP and UINT_TO_FP nodes with v4i32,
v8i8, v8i16 inputs to allow promotion of v4f16 results.

Add tests for sitofp and uitofp for vec4, vec8, vec16, and i8, i16, i32,
and i64 vectors.  Only missing tests are for v16i8 and v16i16 as the
shift operations are too complicated to write a proper check sequence.

The conversions from v4i64 to v4f16 do not depend on this patch - v4i64
is split and the conversion gets handled while lowering v2i64.  I am
adding a test here for completeness.

Reviewers: aemerson, rengolin, ab, jmolloy, srhines

Subscribers: rengolin, aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D9166

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235609 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 17:16:27 +00:00
Hans Wennborg
defaf830f9 Re-commit r235560: Switch lowering: extract jump tables and bit tests before building binary tree (PR22262)
Third time's the charm. The previous commit was reverted as a
reverse for-loop in SelectionDAGBuilder::lowerWorkItem did 'I--'
on an iterator at the beginning of a vector, causing asserts
when using debugging iterators. This commit fixes that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235608 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 16:45:24 +00:00
Sanjay Patel
08aea0a553 use update_llc_test_checks.py to tighten checking; remove unnecessary CPU param
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235604 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 16:07:50 +00:00
Krzysztof Parzyszek
de0d4bf1d4 [Hexagon] Shrink-wrap stack frame (Hexagon-specific)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235603 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 16:05:39 +00:00
Krzysztof Parzyszek
69c69df308 [Hexagon] Add testcases for stack alignment and variable-sized objects
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235602 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 15:12:49 +00:00
Aaron Ballman
5d538f71c2 Revert r235560; this commit was causing several failed assertions in Debug builds using MSVC's STL. The iterator is being used outside of its valid range.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235597 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 13:41:59 +00:00
Filipe Cabecinhas
0236022390 Be more strict about the operand for the array type in BitcodeReader
Summary: Bug found with AFL fuzz.

Reviewers: rafael

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9016

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235596 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 13:38:21 +00:00
Filipe Cabecinhas
81f9bd3e19 Verify sizes when trying to read a BitcodeAbbrevOp
Summary:
Make sure the abbrev operands are valid and that we can read/skip them
afterwards.

Bug found with AFL fuzz.

Reviewers: rafael

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9030

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235595 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 13:25:35 +00:00
Simon Pilgrim
77aa4a8c4d [DAGCombiner] Remove extra bitcasts surrounding vector shuffles
Patch to remove extra bitcasts from shuffles, this is often a legacy of XformToShuffleWithZero being used to combine bitmaskings (of float vectors bitcast to integer vectors) into shuffles: bitcast(shuffle(bitcast(s0),bitcast(s1))) -> shuffle(s0,s1)

Differential Revision: http://reviews.llvm.org/D9097

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235578 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 08:43:13 +00:00
Karthik Bhat
7ab8b5573e Add support to interchange loops with reductions.
This patch enables interchanging of tightly nested loops with reductions.
Differential Revision: http://reviews.llvm.org/D8314


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235571 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 04:51:44 +00:00
Andrew Kaylor
de625b674b [WinEH] Removing seh-filter.ll until I can determine its validity
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235566 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 00:38:22 +00:00
Andrew Kaylor
a1df0a3120 [WinEH] Don't skip landing pads that end with an unreachable instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235563 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 00:20:44 +00:00
Hans Wennborg
395f4f4b2a Switch lowering: extract jump tables and bit tests before building binary tree (PR22262)
This is a re-commit of r235101, which also fixes the problems with the previous patch:

- Switches with only a default case and non-fallthrough were handled incorrectly

- The previous patch tickled a bug in PowerPC Early-Return Creation which is fixed here.

> This is a major rewrite of the SelectionDAG switch lowering. The previous code
> would lower switches as a binary tre, discovering clusters of cases
> suitable for lowering by jump tables or bit tests as it went along. To increase
> the likelihood of finding jump tables, the binary tree pivot was selected to
> maximize case density on both sides of the pivot.
>
> By not selecting the pivot in the middle, the binary trees would not always
> be balanced, leading to performance problems in the generated code.
>
> This patch rewrites the lowering to search for clusters of cases
> suitable for jump tables or bit tests first, and then builds the binary
> tree around those clusters. This way, the binary tree will always be balanced.
>
> This has the added benefit of decoupling the different aspects of the lowering:
> tree building and jump table or bit tests finding are now easier to tweak
> separately.
>
> For example, this will enable us to balance the tree based on profile info
> in the future.
>
> The algorithm for finding jump tables is quadratic, whereas the previous algorithm
> was O(n log n) for common cases, and quadratic only in the worst-case. This
> doesn't seem to be major problem in practice, e.g. compiling a file consisting
> of a 10k-case switch was only 30% slower, and such large switches should be rare
> in practice. Compiling e.g. gcc.c showed no compile-time difference.  If this
> does turn out to be a problem, we could limit the search space of the algorithm.
>
> This commit also disables all optimizations during switch lowering in -O0.
>
> Differential Revision: http://reviews.llvm.org/D8649

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235560 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 23:14:56 +00:00
David Majnemer
f9c92b069a [InstCombine] Use a more targeted fix instead of r235544
Only clear out the NSW/NUW flags if we are optimizing 'add'/'sub' while
taking advantage that the sign bit is not set.  We do this optimization
to further shrink the mask but shrinking the mask isn't NSW/NUW
preserving in this case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235558 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 22:42:05 +00:00
Reid Kleckner
d9b72fea11 [SEH] Remove the old __C_specific_handler code now that WinEHPrepare works
This removes the -sehprepare flag and makes __C_specific_handler
functions always to use WinEHPrepare.

This was tested by building all of chromium_builder_tests and running a
few tests that use SEH, but if something breaks, we can revert this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235557 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 22:13:09 +00:00
Krzysztof Parzyszek
391b60ce58 Unxfail passing test on Hexagon
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235556 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 21:41:24 +00:00
Krzysztof Parzyszek
bbe056c9bc [Hexagon] Some cleanup of instruction selection code
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235552 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 21:17:00 +00:00
Reid Kleckner
de3495610d [WinEH] Demote values and phis live across exception handlers up front
In particular, this handles SSA values that are live *out* of a handler.
The existing code only handles values that are live *in* to a handler.

It also handles phi nodes in the block where normal control should
resume after the end of a catch handler.  When EH return points have phi
nodes, we need to split the return edge. It is impossible for phi
elimination to emit copies in the previous block if that block gets
outlined. The indirectbr that we leave in the function is only notional,
and is eliminated from the MachineFunction CFG early on.

Reviewers: majnemer, andrew.w.kaylor

Differential Revision: http://reviews.llvm.org/D9158

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235545 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 21:05:21 +00:00
David Majnemer
3bd87826e5 [InstCombine] Clear out nsw/nuw if we modify computation in the chain
An nsw/nuw operation relies on the values feeding into it to not
overflow if 'poison' is not to be produced.  This means that
optimizations which make modifications to the bottom of a chain (like
SimplifyDemandedBits) must strip out nsw/nuw if they cannot ensure that
they will be preserved.

This fixes PR23309.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235544 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 20:59:28 +00:00
Krzysztof Parzyszek
3c55df1e84 [Hexagon] Use A2_tfrsi for constant pool and jump table addresses
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235535 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 18:25:53 +00:00
Pirama Arumuga Nainar
1d50fea817 Fix correctness check for test_vec_fpextend_double
Summary:
Remove the CHECK-DAG calls introduced in r235341, and add a comment that
this test may break due to scheduling variations.

This patch completes the fix discussed in http://reviews.llvm.org/D8804

Reviewers: dsanders, srhines

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9178

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235530 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 18:04:12 +00:00
Matt Arsenault
a37c0d278b R600: Fix always inline pass breaking noinline functions
No test since calls are not actually supported yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235524 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 17:10:44 +00:00
Sanjay Patel
3f1f6571cc [x86] Add store-folded memop patterns for vcvtps2ph
Differential Revision: http://reviews.llvm.org/D7296



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235517 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 16:11:19 +00:00
Adhemerval Zanella
c728e851dc Support arm32 R_ARM_V4BX relocation format
ARM32 ELF R_ARM_V4BX relocation format is a special relocation type
that records the location of an ARMv4t BX instruction to enable a
static linker to generate ARMv4 compatible instructions.  This
relocation does not contain a reference symbol.

This patch enabled its creation by removing the requeriment of a
relocation symbol target in ELFState<ELFT>::writeSectionContent.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235513 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 15:26:43 +00:00
Brendon Cahoon
8b94db17a4 Fix a type mismatch assert in SCEV division
An assert was triggered when attempting to create a new SCEV
with operands of different types in the visitAddRecExpr. In this
test case, the operand types of the numerator and denominator
are different. The SCEV division code should generate a
conservative answer when this happens.

Differential Revision: http://reviews.llvm.org/D9021


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235511 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 15:06:40 +00:00
Andrea Di Biagio
6c347524e2 [X86][AVX] Fix failure due to a missing ISel pattern to select VBROADCAST nodes (PR23259).
This fixes a regression introduced at revision 218263.

On AVX, if we optimize for size, a splat build_vector of a load
is lowered into a VBROADCAST node. This is done even if the value type of the
splat build_vector node is v2i64.

Since AVX doesn't support v2f64/v2i64 broadcasts, revision 218263 added two
extra tablegen patterns to allow selecting a VMOVDDUPrm from an X86VBroadcast
where the scalar element comes from a loadi64/loadf64.

However, revision 218263 forgot to add an extra fallback pattern for the case
where we have a X86VBroadcast of a loadi64 with multiple uses.

This patch adds the missing tablegen pattern in X86InstrSSE.td.
This patch also adds an extra test to 'splat-for-size.ll' to verify that ISel
doesn't crash with a 'fatal error in the backend' due to a missing AVX pattern
to select v2i64 X86ISD::BROADCAST nodes.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235509 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 14:53:39 +00:00
Hal Finkel
61ffda59f9 [DAGCombine] Disable select(c, load,load) for indexed loads
This turned up after r235333, but was a pre-existing bug. The optimization
which transforms select(c, load, load) into a load of a select of the addresses
does not handle indexed loads (pre/post inc/dec). However, it did not check for
them either, leading to a crash if it tried to transform one of them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235497 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 11:32:25 +00:00
Vasileios Kalintiris
21249eca6b Revert "[mips][FastISel] Implement shift ops for Mips fast-isel."
This reverts commit r235194. It was causing a failure in FastISel buildbots
due to sign-extension issues.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235495 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 10:08:46 +00:00
James Molloy
89cc8dd3b8 [AArch64] Disable complex GEP optimization by default.
Enough concerns were raised that this optimization is pessimising some code patterns.

The obvious fix, to add a Reassociate run afterwards, causes even more pessimisation in some cases due to fewer complex addressing modes being matched. As there isn't a trivial fix for this, backing this out by default until someone gets a chance to fix the addressing mode matcher.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235491 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 09:11:38 +00:00
Filipe Cabecinhas
e16cac587a Have more strict type checks when creating BinOp nodes in BitcodeReader
Summary: Bug found with AFL.

Reviewers: rafael, bkramer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9015

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235489 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 09:06:21 +00:00
Lang Hames
a1c0ce8518 [patchpoint] Add support for symbolic patchpoint targets to SelectionDAG and the
X86 backend.

The code generated for symbolic targets is identical to the code generated for
constant targets, except that a relocation is emitted to fix up the actual
target address at link-time. This allows IR and object files containing
patchpoints to be cached across JIT-invocations where the target address may
change.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235483 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 06:02:31 +00:00
Duncan P. N. Exon Smith
fae374b95e Linker: Add flag to override linkage rules
Add a flag to lib/Linker (and `llvm-link`) to override linkage rules.
When set, the functions in the source module *always* replace those in
the destination module.

The `llvm-link` option is `-override=abc.ll`.  All the "regular" modules
are loaded and linked first, followed by the `-override` modules.  This
is useful for debugging workflows where some subset of the module (e.g.,
a single function) is extracted into a separate file where it's
optimized differently, before being merged back in.

Patch by Luqman Aden!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235473 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 04:11:00 +00:00
Sanjay Patel
2b2b3a87da [x86] allow 64-bit extracted vector element integer stores on a 32-bit system
With SSE2, we can generate a 'movq' or other 64-bit store op on a 32-bit system
even though 64-bit integers are not legal types.

So instead of producing this:

  pshufd	$229, %xmm0, %xmm1      ## xmm1 = xmm0[1,1,2,3]
  movd	%xmm0, (%eax)
  movd	%xmm1, 4(%eax)

We can do:

  movq %xmm0, (%eax)

This is a fix for the problem noted in D7296.

Differential Revision: http://reviews.llvm.org/D9134



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235460 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 00:24:30 +00:00
Reid Kleckner
8992ead662 [WinEH] Correctly handle inlined __finally blocks with captures
We should also teach the inliner to collapse framerecover of
frameaddress of the current frame down to an alloca, but that can happen
later.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235459 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 00:07:52 +00:00
NAKAMURA Takumi
bc4233f437 Remove a zero-length file of llvm/test/Transforms/InstCombine/descale-zero.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235457 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 23:14:33 +00:00
Wei Mi
ef67950b62 Limiting gep merging to fix the performance problem described in
https://llvm.org/bugs/show_bug.cgi?id=23163.

Gep merging sometimes behaves like a reverse CSE/LICM optimization,
which has negative impact on performance. In this patch we restrict
gep merging to happen only when the indexes to be merged are both consts,
which ensures such merge is always beneficial.

The patch makes gep merging only happen in very restrictive cases.
It is possible that some analysis/optimization passes rely on the merged
geps to get better result, and we havn't notice them yet. We will be ready
to further improve it once we see the cases.

Differential Revision: http://reviews.llvm.org/D8911


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235455 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 23:02:15 +00:00
Wei Mi
480fc70c43 Revert r235451 since it is attached to a wrong Differential Revision. Sorry.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235453 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 22:56:09 +00:00
Wei Mi
73a5fa9ad6 Limiting gep merging to fix the performance problem described in
https://llvm.org/bugs/show_bug.cgi?id=23163.

Gep merging sometimes behaves like a reverse CSE/LICM optimizations,
which has negative impact on performance. In this patch we restrict
gep merging to happen only when the indexes to be merged are both consts,
which ensures such merge is always beneficial.

The patch makes gep merging only happen in very restrictive cases.
It is possible that some analysis/optimization passes rely on the merged
geps to get better result, and we havn't notice them yet. We will be ready
to further improve it once we see the cases.

Differential Revision: http://reviews.llvm.org/D9007


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235451 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 22:37:09 +00:00
Ahmed Bougacha
0f32a037ef [MemCpyOpt] Use the raw i8* dest when optimizing memset+memcpy.
MemIntrinsic::getDest() looks through pointer casts, and using it
directly when building the new GEP+memset results in stuff like:

  %0 = getelementptr i64* %p, i32 16
  %1 = bitcast i64* %0 to i8*
  call ..memset(i8* %1, ...)

instead of the correct:

  %0 = bitcast i64* %p to i8*
  %1 = getelementptr i8* %0, i32 16
  call ..memset(i8* %1, ...)

Instead, use getRawDest, which just gives you the i8* value.
While there, use the memcpy's dest, as it's live anyway.

In most cases, when the optimization triggers, the memset and memcpy
sizes are the same, so the built memset is 0-sized and eliminated.
The problem occurs when they're different.

Fixes a regression caused by r235232: PR23300.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235419 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 21:28:33 +00:00
Krzysztof Parzyszek
a42f6b9a58 [Hexagon] Patterns for frame index with offset for isel
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235418 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 21:28:03 +00:00
Jingyue Wu
423476d899 [SLSR] garbage-collect unused instructions
Summary:
After we rewrite a candidate, the instructions used by the old form may
become unused. This patch cleans up these unused instructions so that we
needn't run DCE after SLSR.

Test Plan: removed -dce in all the SLSR tests

Reviewers: broune, meheff

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9101

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235410 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 19:56:18 +00:00
Jingyue Wu
412b6fc7b9 [SeparateConstOffsetFromGEP] garbage-collect intermediate instructions
Summary: so that we needn't run DCE after this pass.

Test Plan: removed -dce from the commandline in split-gep.ll and split-gep-and-gvn.ll

Reviewers: meheff

Subscribers: llvm-commits, HaoLiu, hfinkel, jholewinski

Differential Revision: http://reviews.llvm.org/D9096

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235409 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 19:53:18 +00:00
Reid Kleckner
405cc64eac Re-land r235154-r235156 under the existing -sehprepare flag
Keep the old SEH fan-in lowering on by default for now, since projects
rely on it.  This will make it easy to test this change with a simple
flag flip.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235399 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 18:23:57 +00:00
Matthias Braun
9e0a1565b9 X86: Match for X86ISD nodes in LowerBUILD_VECTOR instead of BUILD_VECTORCombine
There doesn't seem to be a reason to perform this target ISD node matching
in an DAGCombine, moving it to lowering fixes PR23296.

Differential Revision: http://reviews.llvm.org/D9137

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235394 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 17:21:36 +00:00
Elena Demikhovsky
bf704ed348 AVX-512: Added VPMOVx2M instructions for SKX,
fixed encoding of VPMOVM2x.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235385 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 14:38:31 +00:00
Elena Demikhovsky
695922de3d AVX-512: Added VPTESTM and VPTESTNM instructions for SKX
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235383 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 13:13:46 +00:00
Toma Tabacu
203a9224ff [mips] [IAS] Implement the .asciiz directive.
Summary:
This directive is exactly the same as .asciz, except it's only used by MIPS.
It is used to store null terminated strings in object files.

Reviewers: rafael, dsanders, echristo

Reviewed By: dsanders, echristo

Subscribers: echristo, llvm-commits

Differential Revision: http://reviews.llvm.org/D7530

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235382 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 11:50:52 +00:00
Jozef Kolek
c589d1b3bc [mips][microMIPSr6] Implement CACHE and PREF instructions
Implement CACHE and PREF instructions using mapping.

Differential Revision: http://reviews.llvm.org/D8893


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235379 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 11:17:25 +00:00
Vasileios Kalintiris
d72ba1af57 [mips] Optimize code generation for 64-bit variable shift instructions.
Summary:
The 64-bit version of the variable shift instructions uses the
shift_rotate_reg class which uses a GPR32Opnd to specify the variable
shift amount. With this patch we avoid the generation of a redundant
SLL instruction for the variable shift instructions in 64-bit targets.

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7413

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235376 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 10:49:03 +00:00
Elena Demikhovsky
a1fa0de258 AVX-512: Added logical and arithmetic instructions for SKX
by Asaf Badouh (asaf.badouh@intel.com)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235375 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 10:27:40 +00:00
Simon Pilgrim
01eaaa72bf [X86][SSE] Provide execution domains for scalar floating point operations
This is an updated version of Chandler's patch D7402 that got accepted but never committed, and has bit-rotted a bit since.

I've updated the execution domain declarations to match the approach of the packed templates and also added some extra scalar unary tests.

Differential Revision: http://reviews.llvm.org/D9095

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235372 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 08:40:22 +00:00
Simon Pilgrim
d2f0700f15 CONCAT_VECTOR of BUILD_VECTOR - minor fix
Fixed issue with the combine of CONCAT_VECTOR of 2 BUILD_VECTOR nodes - the optimisation wasn't ensuring that the scalar operands of both nodes were the same type/size for implicit truncation.

Test case spotted by Patrik Hagglund

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235371 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 08:05:43 +00:00
Pawel Bylica
775c174b7b Fix generic shift expansion when shift amount is 0
Summary:
This fixes http://llvm.org/bugs/show_bug.cgi?id=16439. 

This is one possible way to approach this. The other would be to split InL>>(nbits-Amt) into (InL>>(nbits-1-Amt))>>1, which is also valid since since we only need to care about Amt up nbits-1. It's hard to tell which one is better since the shift might be expensive if this stage of expansion is not yet a legal machine integer, whereas comparisons with zero are relatively cheap at all sizes, but more expensive than a shift if the shift is on a legal machine type. 

Patch by Keno Fischer!

Test Plan: regression test from http://reviews.llvm.org/D7752

Reviewers: chfast, resistor

Reviewed By: chfast, resistor

Subscribers: sanjoy, resistor, chfast, llvm-commits

Differential Revision: http://reviews.llvm.org/D4978



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235370 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 06:28:36 +00:00
Matthias Braun
6fbedc4cfd X86: Do not select X86 custom vector nodes if operand types don't match
X86ISD::ADDSUB, X86ISD::(F)HADD, X86ISD::(F)HSUB should not be selected
if the operand types do not match the result type because vector type
legalization cannot deal with this for custom nodes.

Testcase X86ISD::ADDSUB is attached. I could not create a testcase for
the FHADD/FHSUB cases because of: https://llvm.org/bugs/show_bug.cgi?id=23296

Differential Revision: http://reviews.llvm.org/D9120

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235367 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 01:13:41 +00:00
Derek Schuff
9b56994421 Tighten bundling section alignment test.
Leftover comment from http://reviews.llvm.org/D9131

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235366 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 00:17:59 +00:00
Derek Schuff
a49508cb92 [MC] When using bundle aligment, align sections to bundle size
Summary:
Bundle aligment requires that the functions always start at an aligned address.
Usually this is ensured by the compiler, but assembly code does not always
begin with a .align directive.

This change ensures that sections get the correct alignment if they contain
any instructions and bundling is enabled. (It also makes LLVM match the
behavior of GNU as).

Differential Revision: http://reviews.llvm.org/D9131

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235365 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 00:14:25 +00:00
Fiona Glaser
b5750565de InstCombine: fold (sitofp (zext x)) to (uitofp x)
This is okay because the zext guarantees the high bit is zero,
and so the value is unsigned.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235364 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 00:05:41 +00:00
Pirama Arumuga Nainar
1aebbfac0a Fix flakiness in fp16-promote.ll
Summary:
In the f16-promote test, make the checks for native conversion instructions
similar to the libcall checks:
- Remove hard coded register names
- Do not check exact instruction sequences.

This fixes test flakiness due to non-determinism in instruction
scheduling and register allocation.  I also fixed a few minor things in
the CHECK-LIBCALL checks.

I'll try to find a way to check that unnecessary loads, stores, or
conversions don't happen.

Reviewers: mzolotukhin, srhines, ab

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9112

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235363 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 23:54:41 +00:00
JF Bastien
7b862ec88e bugpoint Enhancement.
Summary:
This patch adds two flags to `bugpoint`: "-replace-funcs-with-null" and "-disable-pass-list-reduction".

When "-replace-funcs-with-null" is specified, bugpoint will, instead of simply deleting function bodies, replace all uses of functions and then will delete functions completely from the test module, correctly handling aliasing and @llvm.used && @llvm.compiler.used. This part was conceived while trying to debug the PNaCl IR simplification passes, which don't allow undefined functions (ie no declarations).

With "-disable-pass-list-reduction", bugpoint won't try to reduce the set of passes causing the "crash". This is needed in cases where one is trying to debug an issue inside the PNaCl IR simplification passes which is causing an PNaCl ABI verification error, for example.

Reviewers: jfb

Reviewed By: jfb

Subscribers: jfb, llvm-commits

Differential Revision: http://reviews.llvm.org/D8555

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235362 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 23:42:22 +00:00
Sanjay Patel
af337cd20e use update_llc_test_checks.py to tighten checking
Also, replace win and linux runs with a generic run because that
makes no difference in what this test is checking.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235361 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 23:31:53 +00:00
Andrew Kaylor
e0d6a5c90a [WinEH] Fix problem with mapping shared empty handler blocks.
Differential Revision: http://reviews.llvm.org/D9125



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235354 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 22:04:09 +00:00
Olivier Sallenave
d153d3d8cc Refactoring and enhancement to FMA combine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235344 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 20:29:40 +00:00
Andrew Kaylor
d485ac5538 Fixing line endings
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235342 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 20:27:28 +00:00
Pirama Arumuga Nainar
9d1b182f81 [MIPS] OperationAction for FP_TO_FP16, FP16_TO_FP
Summary:
Set operation action for FP16 conversion opcodes, so the Op legalizer
can choose the gnu_* libcalls for Mips.

Set LoadExtAction and TruncStoreAction for f16 scalars and vectors to
prevent (fpext (load )) and (store (fptrunc)) from getting combined into
unsupported operations.

Added test cases to test that these operations are handled correctly
for f16 scalars and vectors.  This patch depends on
http://reviews.llvm.org/D8755.

Reviewers: srhines

Subscribers: llvm-commits, ab

Differential Revision: http://reviews.llvm.org/D8804

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235341 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 20:15:36 +00:00
Tom Stellard
4eccd9814f DAGCombine: Remove redundant NaN checks around ISD::FSQRT
This folds:

(select (setcc x, -0.0, *lt), NaN, (fsqrt x)) -> ( fsqrt x)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235333 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 19:38:27 +00:00
Jozef Kolek
382bee5224 [mips][microMIPSr6] Implement BITSWAP instruction
Implement BITSWAP instruction using mapping.

Differential Revision: http://reviews.llvm.org/D8857


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235321 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 18:14:59 +00:00
Vladimir Sukharev
d1e387b9e6 [AArch64] LORID_EL1 register must be treated as read-only
Patch by: John Brawn

Reviewers: jmolloy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9105


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235314 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 16:54:37 +00:00
Akira Hatanaka
ca9313e65e [InlineFunction] Don't add lifetime markers for zero-sized allocas.
This commit fixes the code which adds lifetime markers in InlineFunction to skip
zero-sized allocas instead of asserting on them.

rdar://problem/20531155


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235312 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 16:11:05 +00:00
Brendon Cahoon
d9b36e1007 Recognize n/1 in the SCEV divide function
n/1 generates a quotient equal to n and a remainder of 0.
If this case is not recognized, then the SCEV divide() function
can return a remainder that is greater than or equal to the
denominator, which means the delinearized subscripts for the
test case will be incorrect.

Differential Revision: http://reviews.llvm.org/D9003


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235311 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 16:03:28 +00:00
Jozef Kolek
fc4915076f [mips][microMIPSr6] Implement disassembler support
Implement disassembler support for microMIPS32r6.

Differential Revision: http://reviews.llvm.org/D8490


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235307 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 14:40:38 +00:00
Jozef Kolek
dbef0175c3 [mips][microMIPSr6] Implement BALC and BC instructions
This patch implements BALC and BC instructions using mapping.

Differential Revision: http://reviews.llvm.org/D8388


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235302 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 13:04:14 +00:00
Rafael Espindola
ca3837369f Look past locals in comdats.
We have to avoid converting a reference to a global into a reference to a local,
but it is fine to look past a local.

Patch by Vasileios Kalintiris.

I just moved the comment and added thet test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235300 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 12:44:06 +00:00
Andrea Di Biagio
14fc08301c [X86][FastIsel] Fix assertion failure when selecting int-to-double conversion (PR23273).
This fixes a regression introduced at revision 231243.
The target-independent selection algorithm in FastISel knows how to select
a SINT_TO_FP if the target is SSE but not AVX. That is because on X86, the
tablegen'd 'fastEmit' functions know how to select CVTSI2SSrr and CVTSI2SDrr.

Method X86FastISel::X86SelectSIToFP was therefore working under the
wrong assumption that the target was AVX. That assumption was incorrect since
we can have a target that is neither AVX nor SSE.

So, rather than asserting for the presence of AVX, we should have had an
early exit from 'X86SelectSIToFP' if the target was not AVX.
This patch fixes the issue replacing the invalid assertion with an early exit.

Thanks to Dimitry Andric for reporting this problem and for providing a small
reproducible testcase. Added test pr23273.ll.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235295 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 11:56:59 +00:00
Simon Atanasyan
23d47d346d [Mips] Support DT_MIPS_OPTIONS dynamic section tag in the llvm-readobj
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235285 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 05:34:48 +00:00
Hal Finkel
6809724935 [InlineAsm] Remove EarlyClobber on registers that are also inputs
When an inline asm call has an output register marked as early-clobber, but
that same register is also an input operand, what should we do? GCC accepts
this, and is documented to accept this for read/write operands saying,
"Furthermore, if the earlyclobber operand is also a read/write operand, then
that operand is written only after it's used." For write-only operands, the
situation seems less clear, but I have at least one existing codebase that
assumes this will work, in part because it has syscall macros like this:

({                                                                         \
  register uint64_t r0 __asm__ ("r0") = (__NR_ ## name);                   \
  register uint64_t r3 __asm__ ("r3") = ((uint64_t) (arg0));               \
  register uint64_t r4 __asm__ ("r4") = ((uint64_t) (arg1));               \
  register uint64_t r5 __asm__ ("r5") = ((uint64_t) (arg2));               \
  __asm__ __volatile__                                                     \
  ("sc"                                                                    \
   : "=&r"(r0),"=&r"(r3),"=&r"(r4),"=&r"(r5)                               \
   :   "0"(r0),  "1"(r3),  "2"(r4),  "3"(r5)                               \
   : "r6","r7","r8","r9","r10","r11","r12","cr0","memory");                \
  r3;                                                                      \
})

Furthermore, with register aliases and subregister relationships that only the
backend knows about, rejecting this in the frontend seems like a difficult
proposition (if we wanted to do so). However, keeping the early-clobber flag on
the INLINEASM MI does not work for us, because it will cause the register's
live interval to end to soon (so it will not appear defined to be used as an
input).

Fortunately, fixing this does not seem hard: When forming the INLINEASM MI,
check to see if any of the early-clobber outputs are also inputs, and if so,
remove the early-clobber flag.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235283 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 00:01:30 +00:00
Simon Pilgrim
ca3e6fafc8 [X86][SSE] Fix for getScalarValueForVectorElement to detect scalar sources requiring truncation.
The fix ensures that scalar sources inserted into a vector are the correct bit size.

Integer scalar sources from BUILD_VECTOR and SCALAR_TO_VECTOR nodes may require truncation that this function doesn't currently support.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235281 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-19 22:16:49 +00:00
Simon Pilgrim
e398eb753a [X86][SSE] Extended copysign tests to include llvm intrinsic implementation and constant folding.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235279 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-19 21:34:57 +00:00
Ahmed Bougacha
3a65b111b5 [MemCpyOpt] Don't force i64 when promoting memset/memcpy sizes.
Harden r235258 to support any integer bitwidth.  The quick glance at
the reference made me think only i32 and i64 were valid types, but
they're not special, so any overload is legal.

Thanks to David Majnemer for noticing!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235261 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-18 23:06:04 +00:00
Simon Pilgrim
4ac6a63687 [X86][AVX2] Force execution domain on broadcast folding tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235260 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-18 21:24:16 +00:00
Simon Pilgrim
48ef68c206 [X86][SSE] Force execution domain on float/double unpack shuffle tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235259 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-18 18:50:55 +00:00
Ahmed Bougacha
ebb4371478 [MemCpyOpt] Promote both memset/memcpy sizes if differently typed.
Followup to r235232, which caused PR23278.

We can't assume the memset and memcpy sizes have the same type, as
nothing in the language reference prevents that.
Instead, zext both to i64 if they disagree.

While there, robustify tests by using i8 %c rather than i8 0 for the
memset character.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235258 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-18 17:57:41 +00:00
David Majnemer
c5edbea4e7 [InstCombine] (mul nsw 1, INT_MIN) != (shl nsw 1, 31)
Multiplying INT_MIN by 1 doesn't trigger nsw.  However, shifting 1 into
the sign bit *does* trigger nsw.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235250 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-18 04:41:30 +00:00
Ahmed Bougacha
35e8055393 [GlobalMerge] Look at uses to create smaller global sets.
Instead of merging everything together, look at the users of
GlobalVariables, and try to group them by function, to create
sets of globals used "together".

Using that information, a less-aggressive alternative is to keep merging
everything together *except* globals that are only ever used alone, that
is, those for which it's clearly non-profitable to merge with others.

In my testing, grouping by Function is too aggressive, but grouping by
BasicBlock is too conservative.  Anything in-between isn't trivially
available, so stick with Function grouping for now.

cl::opts are added for testing; both enabled by default.

A few of the testcases aren't testing the merging proper, but just
various edge cases when merging does occur.  Update them to use the
previous grouping behavior. Also, one of the tests is unrelated to
GlobalMerge; change it accordingly.
While there, switch to r234666' flags rather than the brutal -O3.

Differential Revision: http://reviews.llvm.org/D8070


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235249 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-18 01:21:58 +00:00
Ahmed Bougacha
6b96a388ed [AArch64] Don't force MVT::Untyped when selecting LD1LANEpost.
The result is either an Untyped reg sequence, on ldN with N > 1, or
just the type of the input vector, on ld1.  Don't force Untyped.
Instead, just use the type of the reg sequence.

This mirrors the behavior of createTuple, which feeds the LD1*_POST.

The narrow code path wasn't actually covered by tests, because V64
insert_vector_elt are widened to V128 before the LD1LANEpost combine
has the chance to run, usually.

The only case where it does run on V64 vectors is if the vector ops
legalizer ran.  So, tickle the code with a ctpop.

Fixes PR23265.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235243 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-17 23:43:33 +00:00