Commit Graph

25800 Commits

Author SHA1 Message Date
Moritz Roth
a6afad8b33 Thumb1 load/store optimizer: Improve code to materialize new base register.
There are two add-immediate instructions in Thumb1: tADDi8 and tADDi3. Only
the latter supports using different source and destination registers, so
whenever we materialize a new base register (at a certain offset) we'd do
so by moving the base register value to the new register and then adding in
place. This patch changes the code to use a single tADDi3 if the offset is
small enough to fit in 3 bits.

Differential Revision: http://reviews.llvm.org/D5006

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216193 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 17:11:03 +00:00
Juergen Ributzka
69ec09b61f [FastISel][AArch64] Remove redundant test.
These tests and many more are already covered by fast-isel-addressing-modes.ll.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216186 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 16:40:05 +00:00
Jonathan Roelofs
4c3be1aa0f Add a thread-model knob for lowering atomics on baremetal & single threaded systems
http://reviews.llvm.org/D4984


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216182 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 14:35:47 +00:00
Benjamin Kramer
daada81e5c DAGCombiner: Make concat_vector combine safe for EVTs and concat_vectors with many arguments.
PR20677

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216175 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 13:28:02 +00:00
Oliver Stannard
760a46522a [ARM] Enable DP copy, load and store instructions for FPv4-SP
The FPv4-SP floating-point unit is generally referred to as
single-precision only, but it does have double-precision registers and
load, store and GPR<->DPR move instructions which operate on them.
This patch enables the use of these registers, the main advantage of
which is that we now comply with the AAPCS-VFP calling convention.
This partially reverts r209650, which added some AAPCS-VFP support,
but did not handle return values or alignment of double arguments in
registers.

This patch also adds tests for Thumb2 code generation for
floating-point instructions and intrinsics, which previously only
existed for ARM.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216172 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 12:50:31 +00:00
Erik Verbruggen
0edc5e8391 Reassociate x + -0.1234 * y into x - 0.1234 * y
This does not require -ffast-math, and it gives CSE/GVN more options to
eliminate duplicate expressions in, e.g.:

  return ((x + 0.1234 * y) * (x - 0.1234 * y));

Differential Revision: http://reviews.llvm.org/D4904



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216169 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 10:45:30 +00:00
Robert Khasanov
fec1abaeab [x86] Added _addcarry_ and _subborrow_ intrinsics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216164 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 09:43:43 +00:00
Robert Khasanov
10dacc4b52 [x86] Broadwell: ADOX/ADCX. Added _addcarryx_u{32|64} intrinsics to LLVM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216162 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 09:27:00 +00:00
Zinovy Nis
164cd0161e [INDVARS] Extend using of widening of induction variables for the cases of "sub nsw" and "mul nsw" instructions.
Currently only "add nsw" are widened. This patch eliminates tons of "sext" instructions for 64 bit code (and the corresponding target code) in cases like:

int N = 100;
float **A;

void foo(int x0, int x1)
{
        float * A_cur = &A[0][0];
        float * A_next = &A[1][0];
        for(int x = x0; x < x1; ++x).
        {
          // Currently only [x+N] case is widened. Others 2 cases lead to sext.
          // This patch fixes it, so all 3 cases do not need sext.
          const float div = A_cur[x + N] + A_cur[x - N] + A_cur[x * N];
          A_next[x] = div;
        }
}
...
> clang++ test.cpp -march=core-avx2 -Ofast  -fno-unroll-loops -fno-tree-vectorize -S -o -

Differential Revision: http://reviews.llvm.org/D4695



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216160 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 08:25:45 +00:00
David Majnemer
e234f93b3e InstCombine: Fold ((A | B) & C1) ^ (B & C2) -> (A & C1) ^ B if C1^C2=-1
Adapted from a patch by Richard Smith, test-case written by me.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216157 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 05:14:48 +00:00
Jiangning Liu
82f1a8cc09 Fix a bug around truncating vector in const prop.
In constant folding stage, "TRUNC" can't handle vector data type.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216149 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 02:12:35 +00:00
Jiangning Liu
150cef218f Revert r216066, "Optimize ZERO_EXTEND and SIGN_EXTEND in both SelectionDAG Builder and type".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216147 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 01:59:30 +00:00
Quentin Colombet
e817bdd304 [PeepholeOptimizer] Take advantage of the isInsertSubreg property in the
advanced copy optimization.

This is the final step patch toward transforming:
udiv    r0, r0, r2
udiv    r1, r1, r3
vmov.32 d16[0], r0
vmov.32 d16[1], r1
vmov    r0, r1, d16
bx      lr

into:
udiv    r0, r0, r2
udiv    r1, r1, r3
bx      lr

Indeed, thanks to this patch, this optimization is able to look through
vmov.32 d16[0], r0
vmov.32 d16[1], r1

and is able to rewrite the following sequence:
vmov.32 d16[0], r0
vmov.32 d16[1], r1
vmov    r0, r1, d16

into simple generic GPR copies that the coalescer managed to remove.

<rdar://problem/12702965>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216144 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 00:19:16 +00:00
Jonathan Roelofs
506ed4d4a5 Lower thumbv4t & thumbv5 lo->lo copies through a push-pop sequence
On pre-v6 hardware, 'MOV lo, lo' gives undefined results, so such copies need to
be avoided. This patch trades simplicity for implementation time at the expense
of performance... As they say: correctness first, then performance.

See http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075998.html for a few
ideas on how to make this better.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216138 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 23:38:50 +00:00
Yi Jiang
ee1b45f2a2 New InstCombine pattern: (icmp ult/ule (A + C1), C3) | (icmp ult/ule (A + C2), C3) to (icmp ult/ule ((A & ~(C1 ^ C2)) + max(C1, C2)), C3) under certain condition
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216135 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 22:55:40 +00:00
Sanjay Patel
89305e5345 Don't prevent a vselect of constants from becoming a single load (PR20648).
Fix for PR20648 - http://llvm.org/bugs/show_bug.cgi?id=20648

This patch checks the operands of a vselect to see if all values are constants.
If yes, bail out of any further attempts to create a blend or shuffle because
SelectionDAGLegalize knows how to turn this kind of vselect into a single load.

This already happens for machines without SSE4.1, so the added checks just send
more targets down that path.

Differential Revision: http://reviews.llvm.org/D4934


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216121 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 20:34:56 +00:00
Duncan P. N. Exon Smith
4641d5dbdf X86: Add missing triples from r216119
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216120 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 19:58:59 +00:00
Duncan P. N. Exon Smith
5012f1db20 X86: Align the stack on word boundaries in LowerFormalArguments()
The goal of the patch is to implement section 3.2.3 of the AMD64 ABI
correctly.  The controlling sentence is, "The size of each argument gets
rounded up to eightbytes.  Therefore the stack will always be eightbyte
aligned." The equivalent sentence in the i386 ABI page 37 says, "At all
times, the stack pointer should point to a word-aligned area."  For both
architectures, the stack pointer is not being rounded up to the nearest
eightbyte or word between the last normal argument and the first
variadic argument.

Patch by Thomas Jablin!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216119 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 19:40:59 +00:00
Keno Fischer
4b1cddbaf0 Do not insert a tail call when returning multiple values on X86
Summary: This fixes http://llvm.org/bugs/show_bug.cgi?id=19530.
The problem is that X86ISelLowering erroneously thought the third call
was eligible for tail call elimination.
It would have been if it's return value was actually the one returned
by the calling function, but here that is not the case and
additional values are being returned.

Test Plan: Test case from the original bug report is included.

Reviewers: rafael

Reviewed By: rafael

Subscribers: rafael, llvm-commits

Differential Revision: http://reviews.llvm.org/D4968

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216117 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 19:00:37 +00:00
Sanjay Patel
3deb3e32b2 critical-anti-dependency breaker: don't use reg def info from kill insts (PR20308)
In PR20308 ( http://llvm.org/bugs/show_bug.cgi?id=20308 ), the critical-anti-dependency breaker
caused a miscompile because it broke a WAR hazard using a register that it thinks is available
based on info from a kill inst. Until PR18663 is solved, we shouldn't use any def/use info from
a kill because they are really just nops.

This patch adds guard checks for kills around calls to ScanInstruction() where the DefIndices
array is set. For good measure, add an assert in ScanInstruction() so we don't hit this bug again.

The test case is a reduced version of the code from the bug report.

Differential Revision: http://reviews.llvm.org/D4977



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216114 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 18:03:00 +00:00
Quentin Colombet
dcd3cbea54 [PeepholeOptimizer] Refactor the advanced copy optimization to take advantage of
the isRegSequence property.

This is a follow-up of r215394 and r215404, which respectively introduces the
isRegSequence property and uses it for ARM.

Thanks to the property introduced by the previous commits, this patch is able
to optimize the following sequence:
vmov	d0, r2, r3
vmov	d1, r0, r1
vmov	r0, s0
vmov	r1, s2
udiv	r0, r1, r0
vmov	r1, s1
vmov	r2, s3
udiv	r1, r2, r1
vmov.32	d16[0], r0
vmov.32	d16[1], r1
vmov	r0, r1, d16
bx	lr

into:
udiv	r0, r0, r2
udiv	r1, r1, r3
vmov.32	d16[0], r0
vmov.32	d16[1], r1
vmov	r0, r1, d16
bx	lr

This patch refactors how the copy optimizations are done in the peephole
optimizer. Prior to this patch, we had one copy-related optimization that
replaced a copy or bitcast by a generic, more suitable (in terms of register
file), copy.

With this patch, the peephole optimizer features two copy-related optimizations:
1. One for rewriting generic copies to generic copies:
PeepholeOptimizer::optimizeCoalescableCopy.
2. One for replacing non-generic copies with generic copies:
PeepholeOptimizer::optimizeUncoalescableCopy.

The goals of these two optimizations are slightly different: one rewrite the
operand of the instruction (#1), the other kills off the non-generic instruction
and replace it by a (sequence of) generic instruction(s).

Both optimizations rely on the ValueTracker introduced in r212100.

The ValueTracker has been refactored to use the information from the
TargetInstrInfo for non-generic instruction. As part of the refactoring, we
switched the tracking from the index of the definition to the actual register
(virtual or physical). This one change is to provide better consistency with
register related APIs and to ease the use of the TargetInstrInfo.

Moreover, this patch introduces a new helper class CopyRewriter used to ease the
rewriting of generic copies (i.e., #1).

Finally, this patch adds a dead code elimination pass right after the peephole
optimizer to get rid of dead code that may appear after rewriting.

This is related to <rdar://problem/12702965>.

Review: http://reviews.llvm.org/D4874


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216088 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 17:41:48 +00:00
Juergen Ributzka
5273295751 [FastISel][AArch64] Don't fold the sign-/zero-extend from i1 into the compare.
This fixes a bug I introduced in a previous commit (r216033). Sign-/Zero-
extension from i1 cannot be folded into the ADDS/SUBS instructions. Instead both
operands have to be sign-/zero-extended with separate instructions.

Related to <rdar://problem/17913111>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216073 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 16:34:15 +00:00
Jiangning Liu
e04455d72b Optimize ZERO_EXTEND and SIGN_EXTEND in both SelectionDAG Builder and type
legalization stage. With those two optimizations, fewer signed/zero extension
instructions can be inserted, and then we can expose more opportunities to
Machine CSE pass in back-end.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216066 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 12:05:15 +00:00
Pavel Chupin
aadaac228d [x32] Fix FrameIndex check in SelectLEA64_32Addr
Summary:
Fixes http://llvm.org/bugs/show_bug.cgi?id=20016 reproducible on new
lea-5.ll case.
Also use RSP/RBP for x32 lea to save 1 byte used for 0x67 prefix in
ESP/EBP case.

Test Plan: lea tests modified to include x32/nacl and new test added

Reviewers: nadav, dschuff, t.p.northover

Subscribers: llvm-commits, zinovy.nis

Differential Revision: http://reviews.llvm.org/D4929

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216065 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 11:59:22 +00:00
Yi Kong
40f9d11ccc ARM: Fix codegen for rbit intrinsic
LLVM generates illegal `rbit r0, #352` instruction for rbit intrinsic.
According to ARM ARM, rbit only takes register as argument, not immediate.
The correct instruction should be rbit <Rd>, <Rm>.

The bug was originally introduced in r211057.

Differential Revision: http://reviews.llvm.org/D4980

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216064 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 10:40:20 +00:00
David Majnemer
99e941fd9a InstCombine: Annotate sub with nuw when we prove it's safe
We can prove that a 'sub' can be a 'sub nuw' if the left-hand side is
negative and the right-hand side is non-negative.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216045 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 07:17:31 +00:00
Peter Collingbourne
b3b125aafc [dfsan] Treat vararg custom functions like unimplemented functions.
Because declarations of these functions can appear in places like autoconf
checks, they have to be handled somehow, even though we do not support
vararg custom functions. We do so by printing a warning and calling the
uninstrumented function, as we do for unimplemented functions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216042 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 01:40:23 +00:00
Juergen Ributzka
3ef392c4e2 [FastISel][AArch64] Use the proper FMOV instruction to materialize a +0.0.
Use FMOVWSr/FMOVXDr instead of FMOVSr/FMOVDr, which have the proper register
class to be used with the zero register. This makes the MachineInstruction
verifier happy again.

This is related to <rdar://problem/18027157>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216040 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 01:10:36 +00:00
David Majnemer
e0134d95cc InstCombine: Annotate sub with nsw when we prove it's safe
We can prove that a 'sub' can be a 'sub nsw' under certain conditions:
- The sign bits of the operands is the same.
- Both operands have more than 1 sign bit.

The subtraction cannot be a signed overflow in either case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216037 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 23:36:30 +00:00
Juergen Ributzka
ae9a7964ef [FastISel][AArch64] Factor out ADDS/SUBS instruction emission and add support for extensions and shift folding.
Factor out the ADDS/SUBS instruction emission code into helper functions and
make the helper functions more clever to support most of the different ADDS/SUBS
instructions the architecture support. This includes better immedediate support,
shift folding, and sign-/zero-extend folding.

This fixes <rdar://problem/17913111>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216033 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 22:29:55 +00:00
Duncan P. N. Exon Smith
7838818ad7 IR: Implement uselistorder assembly directives
Implement `uselistorder` and `uselistorder_bb` assembly directives,
which allow the use-list order to be recovered when round-tripping to
assembly.

This is the bulk of PR20515.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216025 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 21:30:15 +00:00
Lang Hames
722885f5f1 [MCJIT] Add an i386 RuntimeDyldMachO test case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216024 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 21:26:36 +00:00
Duncan P. N. Exon Smith
13f5c5896d verify-uselistorder: Force -preserve-bc-use-list-order
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216022 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 21:08:27 +00:00
Juergen Ributzka
1d58f989d4 [FastISel][AArch64] Extend floating-point materialization test.
This adds the missing test that I promised for r215753 to test the
materialization of the floating-point value +0.0.

Related to <rdar://problem/18027157>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216019 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 20:35:07 +00:00
Juergen Ributzka
06bb1ca1e0 Reapply [FastISel][AArch64] Add support for more addressing modes (r215597).
Note: This was originally reverted to track down a buildbot error. Reapply
without any modifications.

Original commit message:
FastISel didn't take much advantage of the different addressing modes available
to it on AArch64. This commit allows the ComputeAddress method to recognize more
addressing modes that allows shifts and sign-/zero-extensions to be folded into
the memory operation itself.

For Example:
  lsl x1, x1, #3     --> ldr x0, [x0, x1, lsl #3]
  ldr x0, [x0, x1]

  sxtw x1, w1
  lsl x1, x1, #3     --> ldr x0, [x0, x1, sxtw #3]
  ldr x0, [x0, x1]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216013 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 19:44:17 +00:00
Juergen Ributzka
96b1e70c66 Reapply [FastISel][X86] Add large code model support for materializing floating-point constants (r215595).
Note: This was originally reverted to track down a buildbot error. Reapply
without any modifications.

Original commit message:
In the large code model for X86 floating-point constants are placed in the
constant pool and materialized by loading from it. Since the constant pool
could be far away, a PC relative load might not work. Therefore we first
materialize the address of the constant pool with a movabsq and then load
from there the floating-point value.

Fixes <rdar://problem/17674628>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216012 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 19:44:13 +00:00
Juergen Ributzka
9c23685dd2 Reapply [FastISel][X86] Use XOR to materialize the "0" value (r215594).
Note: This was originally reverted to track down a buildbot error. Reapply
without any modifications.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216011 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 19:44:10 +00:00
Juergen Ributzka
e8757c5dbb Reapply [FastISel][X86] Emit more efficient instructions for integer constant materialization (r215593).
Note: This was originally reverted to track down a buildbot error. Reapply
without any modifications.

Original commit message:
This mostly affects the i64 value type, which always resulted in an 15byte
mobavsq instruction to materialize any constant. The custom code checks the
value of the immediate and tries to use a different and smaller mov
instruction when possible.

This fixes <rdar://problem/17420988>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216010 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 19:44:06 +00:00
Juergen Ributzka
78f686d37c Reapply [FastISel][AArch64] Make use of the zero register when possible (r215591).
Note: This was originally reverted to track down a buildbot error. Reapply
without any modifications.

Original commit message:
This change materializes now the value "0" from the zero register.
The zero register can be folded by several instruction, so no
materialization is need at all.

Fixes <rdar://problem/17924413>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216009 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 19:44:02 +00:00
Juergen Ributzka
f08cddcf56 Reapply [FastISel] Let the target decide first if it wants to materialize a constant (215588).
Note: This was originally reverted to track down a buildbot error. This commit
exposed a latent bug that was fixed in r215753. Therefore it is reapplied
without any modifications.

I run it through SPEC2k and SPEC2k6 for AArch64 and it didn't introduce any new
regeressions.

Original commit message:
This changes the order in which FastISel tries to materialize a constant.
Originally it would try to use a simple target-independent approach, which
can lead to the generation of inefficient code.

On X86 this would result in the use of movabsq to materialize any 64bit
integer constant - even for simple and small values such as 0 and 1. Also
some very funny floating-point materialization could be observed too.

On AArch64 it would materialize the constant 0 in a register even the
architecture has an actual "zero" register.

On ARM it would generate unnecessary mov instructions or not use mvn.

This change simply changes the order and always asks the target first if it
likes to materialize the constant. This doesn't fix all the issues
mentioned above, but it enables the targets to implement such
optimizations.

Related to <rdar://problem/17420988>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216006 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 19:05:24 +00:00
Renato Golin
8308f0e30f Revert "Small refactor on VectorizerHint for deduplication"
This reverts commit r215994 because MSVC 2012 can't cope with its C++11 goodness.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215999 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 18:08:50 +00:00
Juergen Ributzka
8841fb5f25 [FastISel][AArch64] Fix a few BuildMI callsites where the result register was added as an operand register.
This fixes a few BuildMI callsites where the result register was added by
using addReg, which is per default a use and therefore an operand register.

Also use the zero register as result register when emitting a compare
instruction (SUBS with unused result register).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215997 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 17:41:53 +00:00
Renato Golin
dca126522d Small refactor on VectorizerHint for deduplication
Previously, the hint mechanism relied on clean up passes to remove redundant
metadata, which still showed up if running opt at low levels of optimization.
That also has shown that multiple nodes of the same type, but with different
values could still coexist, even if temporary, and cause confusion if the
next pass got the wrong value.

This patch makes sure that, if metadata already exists in a loop, the hint
mechanism will never append a new node, but always replace the existing one.
It also enhances the algorithm to cope with more metadata types in the future
by just adding a new type, not a lot of code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215994 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 17:30:43 +00:00
Toma Tabacu
109447ff1b [mips] Add assembler support for .set arch=x directive.
Summary:
This directive is similar to ".set mipsX".
It is used to change the CPU target of the assembler, enabling it to accept instructions for a specific CPU.

This patch only implements the r4000 CPU (which is treated internally as generic mips3) and the generic ISAs.

Contains work done by Matheus Almeida.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D4884

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215978 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 14:22:52 +00:00
Mayur Pandey
ecdb0ab90f InstCombine: ((A & ~B) ^ (~A & B)) to A ^ B
Proof using CVC3 follows:
$ cat t.cvc
A, B : BITVECTOR(32);
QUERY BVXOR((A & ~B),(~A & B)) = BVXOR(A,B);
$ cvc3 t.cvc
Valid.

Differential Revision: http://reviews.llvm.org/D4898


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215974 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 08:19:19 +00:00
Akira Hatanaka
6290308366 [X86, X87 stackifier] Do not mark an operand of a debug instruction as kill.
<rdar://problem/16952634>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215962 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-19 02:09:57 +00:00
Duncan P. N. Exon Smith
36c150801a verify-uselistorder: Call verifyModule() and improve output
Call `verifyModule()` after parsing and after every transformation.
Also convert some `DEBUG(dbgs())` to `errs()` to increase visibility
into what's going on.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215951 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-18 23:44:14 +00:00
Robin Morisset
0acd42142a Answer to Philip Reames comments
- add check for volatile (probably unneeded, but I agree that we should be conservative about it).
- strengthen condition from isUnordered() to isSimple(), as I don't understand well enough Unordered semantics (and it also matches the comment better this way) to be confident in the previous behaviour (thanks for catching that one, I had missed the case Monotonic/Unordered).
- separate a condition in two.
- lengthen comment about aliasing and loads
- add tests in GVN/atomic.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215943 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-18 22:18:14 +00:00
Robin Morisset
6c0e1e0fa6 Weak relaxing of the constraints on atomics in MemoryDependencyAnalysis
Monotonic accesses do not have to kill the analysis, as long as the QueryInstr is not
itself atomic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215942 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-18 22:18:11 +00:00
Kevin Enderby
f759032ccd Make llvm-objdump handle both arm and thumb disassembly from the same Mach-O
file with -macho, the Mach-O specific object file parser option.

After some discussion I chose to do this implementation contained in the logic
of llvm-objdump’s MachODump.cpp using a second disassembler for thumb when
needed and with updates mostly contained in the MachOObjectFile class.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215931 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-18 20:21:02 +00:00
Oliver Stannard
eb922109f9 Teach the AArch64 backend to handle f16
This allows the AArch64 backend to handle fadd, fsub, fmul and fdiv
operations on f16 (half-precision) types by promoting to f32.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215891 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-18 14:22:39 +00:00
Oliver Stannard
802d420792 [ARM,AArch64] Do not tail-call to an externally-defined function with weak linkage
Externally-defined functions with weak linkage should not be
tail-called on ARM or AArch64, as the AAELF spec requires normal calls
to undefined weak functions to be replaced with a NOP or jump to the
next instruction. The behaviour of branch instructions in this
situation (as used for tail calls) is implementation-defined, so we
cannot rely on the linker replacing the tail call with a return.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215890 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-18 12:42:15 +00:00
Elena Demikhovsky
9735ccb7ea AVX-512: Fixed a bug in emitting compare for MVT:i1 type.
Added a test.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215889 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-18 11:59:06 +00:00
Saleem Abdulrasool
f15492fd72 ARM: improve RTABI 4.2 conformance on Linux
The set of functions defined in the RTABI was separated for no real reason.
This brings us closer to proper utilisation of the functions defined by the
RTABI.  It also sets the ground for correctly emitting function calls to AEABI
functions on all AEABI conforming platforms.

The previously existing lie on the behaviour of __ldivmod and __uldivmod is
propagated as it is beyond the scope of the change.

The changes to the test are due to the fact that we now use the divmod functions
which return both the quotient and remainder and thus we no longer need to
invoke two functions on Linux (making it closer to EABI's behaviour).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215862 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-17 22:51:02 +00:00
Daniel Sanders
5535fca8bf Revert: r215698 - Current implementation of c.cond.fmt instructions only accept default cc0 register...
It causes a number of regressions when -fintegrated-as is enabled. This happens
because there are codegen-only instructions that incorrectly uses the first
operand as the encoding for the $fcc register. The regressions do not occur when
-via-file-asm is also given.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215847 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-17 19:47:47 +00:00
Saleem Abdulrasool
70d641fbec ARM: correct toggling behaviour
This was a thinko.  The intent was to flip the explicit bits that need toggling
rather than all bits.  This would result in incorrect behaviour (which now is
tested).

Thanks to Nico Weber for pointing this out!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215846 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-17 19:20:38 +00:00
Rafael Espindola
78e8d52a58 llvm-objdump: don't print relocations in non-relocatable files.
This matches the behavior of GNU objdump.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215844 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-17 19:09:37 +00:00
Rafael Espindola
4c511cd88f Fix an off-by-one bug in the target independent llvm-objdump.
It would prevent the display of a single byte instruction before a label.

Patch by Steve King!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215837 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-17 16:31:39 +00:00
Owen Anderson
7a0201c6a6 Remove an InstCombine that transformed patterns like (x * uitofp i1 y) to (select y, x, 0.0) when the multiply has fast math flags set.
While this might seem like an obvious canonicalization, there is one subtle problem with it.  The result of the original expression
is undef when x is NaN (remember, fast math flags), but the result of the select is always defined when x is NaN.  This means that the
new expression is strictly more defined than the original one.  One unfortunate consequence of this is that the transform is not reversible!
It's always legal to make increase the defined-ness of an expression, but it's not legal to reduce it.  Thus, targets that prefer the original
form of the expression cannot reverse the transform to recover it.  Another way to think of it is that the transform has lost source-level
information (the fast math flags), which is undesirable.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215825 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-17 03:51:29 +00:00
NAKAMURA Takumi
0ac5626b56 llvm/test/CodeGen/X86/fmul-combines.ll: Appease Windows x64. <4 x float> is passed by stack.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215821 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-16 22:28:37 +00:00
Matt Arsenault
5f8a9ae17c Fix fmul combines with constant splat vectors
Fixes things like fmul x, 2 -> fadd x, x

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215820 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-16 10:14:19 +00:00
Chandler Carruth
a3805f1c73 [x86] Teach lots of the new vector shuffle lowering to use UNPCK
instructions for blend operations at 128 bits. This was a serious hole
in our prior blend lowering.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215819 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-16 09:42:15 +00:00
David Majnemer
cb698b26a1 InstCombine: Combine mul with div.
We can combne a mul with a div if one of the operands is a multiple of
the other:

%mul = mul nsw nuw %a, C1
%ret = udiv %mul, C2
  =>
%ret = mul nsw %a, (C1 / C2)

This can expose further optimization opportunities if we end up
multiplying or dividing by a power of 2.

Consider this small example:

define i32 @f(i32 %a) {
  %mul = mul nuw i32 %a, 14
  %div = udiv exact i32 %mul, 7
  ret i32 %div
}

which gets CodeGen'd to:

    imull       $14, %edi, %eax
    imulq       $613566757, %rax, %rcx
    shrq        $32, %rcx
    subl        %ecx, %eax
    shrl        %eax
    addl        %ecx, %eax
    shrl        $2, %eax
    retq

We can now transform this into:
define i32 @f(i32 %a) {
  %shl = shl nuw i32 %a, 1
  ret i32 %shl
}

which gets CodeGen'd to:

    leal        (%rdi,%rdi), %eax
    retq

This fixes PR20681.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215815 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-16 08:55:06 +00:00
Nico Weber
f1aba61bb5 arm asm: Let .fpu enable instructions, PR20447.
I'm not very happy with duplicating the fpu->feature mapping in ARMAsmParser.cpp
and in clang's driver. See the bug for a patch that doesn't do that, and the
review thread [1] for why this duplication exists.

1: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140811/231052.html


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215811 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-16 05:37:51 +00:00
Duncan P. N. Exon Smith
1d8c9d95bf BitcodeReader: Only create one basic block for each blockaddress
Block address forward-references are implemented by creating a
`BasicBlock` ahead of time that gets inserted in the `Function` when
it's eventually encountered.

However, if the same blockaddress was used in two separate functions
that were parsed *before* the referenced function (and the blockaddress
was never used at global scope), two separate basic blocks would get
created, one of which would be forgotten creating invalid IR.

This commit changes the forward-reference logic to create only one basic
block (and always return the same blockaddress).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215805 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-16 01:54:37 +00:00
Duncan P. N. Exon Smith
7a5cb43115 IR: Don't add inbounds to GEPs of extern_weak variables
Global variables that have `extern_weak` linkage may be null, so it's
incorrect to add `inbounds` when constant folding.

This also fixes a bug when parsing global aliases, whose forward
reference placeholders are global variables with `extern_weak` linkage.
If GEPs to these aliases are encountered before the alias itself, the
GEPs would incorrectly gain the `inbounds` keyword as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215803 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-16 01:54:32 +00:00
Andrea Di Biagio
89cea3c36b [DAGCombiner] Improve the folding of target independet shuffles to Undef.
When combining a pair of shuffle nodes, check if the combined shuffle mask is
trivially Undef. In case, immediately fold that pair of shuffles to Undef.

The lack of checks for undef masks was the root-cause of a poor-codegen bug
in the dag combiner.

Example:
  %1 = shufflevector <4 x i32> %A, <4 x i32> %B, <4 x i32> <i32 4, i32 1, i32 1, i32 6>
  %2 = shufflevector <4 x i32> %1, <4 x i32> undef, <4 x i32> <i32 0, i32 4, i32 1, i32 6>
  %3 = shufflevector <4 x i32> %2, <4 x i32> undef, <4 x i32> <i32 1, i32 5, i32 3, i32 3>

Before this patch, on x86 (with -mcpu=corei7) we failed to fold the entire
sequence to Undef value and therefore we generated:
  shufps $-123, %xmm1, $xmm0
  pshufd $-46, %xmm0, %xmm0

With this patch, the entire shuffle sequence is folded to Undef and no
shuffles are generated in the output assembly.

Added new test cases to test 'combine-vec-shuffle-5.ll'.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215797 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-16 00:29:44 +00:00
Hal Finkel
5dc48ac04a [PowerPC] Mark fixed-offset byvals as pointed-to by IR values
A byval object, even if allocated at a fixed offset (prescribed by the ABI) is
pointed to by IR values. Most fixed-offset stack objects are not pointed-to by
IR values, so the default is to assume this is not possible. However, we need
to override the default in this case (instruction scheduling can cause
miscompiles otherwise).

Fixes PR20280.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215795 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-16 00:17:05 +00:00
Chad Rosier
cc921d6f41 [AArch32] Add support for FP rounding operations for ARMv8/AArch32.
Phabricator Revision: http://reviews.llvm.org/D4935

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215772 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 21:38:16 +00:00
Rafael Espindola
0549fc2448 Set comdats when lazily linking functions.
We were setting the comdat when functions were copied in the initial pass, but
not when they were linked only when we found out that they are needed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215765 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 20:17:08 +00:00
Matt Arsenault
c86e55eb6e R600/SI: Move all fabs / fneg handling to patterns
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215749 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 18:42:22 +00:00
Matt Arsenault
0498d07255 R600/SI: Use source modifiers for f64 fneg
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215748 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 18:42:18 +00:00
Matt Arsenault
c882fc78fe R600/SI: Use source modifier for f64 fabs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215747 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 18:42:15 +00:00
Matt Arsenault
34ef4cd65b R600/SI: Fix offset folding in some cases with shifted pointers.
Ordinarily (shl (add x, c1), c2) -> (add (shl x, c2), c1 << c2)
is only done if the add has one use. If the resulting constant
add can be folded into an addressing mode, force this to happen
for the pointer operand.

This ends up happening a lot because of how LDS objects are allocated.
Since the globals are allocated next to each other, acessing the first
element of the second object is directly indexed by a shifted pointer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215739 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 17:49:05 +00:00
Chandler Carruth
92ee945e2e [x86] Teach the new AVX v4f64 shuffle lowering to use UNPCK instructions
where applicable for blending.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215737 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 17:42:00 +00:00
Matt Arsenault
5bc44c7603 R600/SI: Add intrinsic for ldexp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215734 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 17:30:25 +00:00
Juergen Ributzka
e2bb4f981b [FastISel][ARM] Fix unit test from r215682.
Thanks Jim for finding this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215733 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 17:23:20 +00:00
Matt Arsenault
ed76ca720b R600/SI: Implement isLegalAddressingMode
The default assumes that a 16-bit signed offset is used.
LDS instruction use a 16-bit unsigned offset, so it wasn't
being used in some cases where it was assumed a negative offset
could be used.

More should be done here, but first isLegalAddressingMode needs
to gain an addressing mode argument. For now, copy most of the rest
of the default implementation with the immediate offset change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215732 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 17:17:07 +00:00
Moritz Roth
d84561bf69 ARM: Fix and re-enable load/store optimizer for Thumb1.
In a previous iteration of the pass, we would try to compensate for
writeback by updating later instructions and/or inserting a SUBS to
reset the base register if necessary.
Since such a SUBS sets the condition flags it's not generally safe to do
this. For now, only merge LDR/STRs if there is no writeback to the base
register (LDM that loads into the base register) or the base register is
killed by one of the merged instructions. These cases are clear wins
both in terms of instruction count and performance.

Also add three new test cases, and update the existing ones accordingly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215729 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 17:00:30 +00:00
Amara Emerson
cef3ad6720 [AArch64] Narrow arguments passed in wrong position on the stack in
big-endian mode.

Patch by Asiri Rathnayake.

Differential Revision: http://reviews.llvm.org/D4922

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215716 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 14:29:57 +00:00
Rafael Espindola
a348fc7fda Remove HasLEB128.
We already require CFI, so it should be safe to require .leb128 and .uleb128.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215712 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 14:01:07 +00:00
Bill Schmidt
44beebe8de [PPC64] Add test case for r215685.
I had deferred adding this test case until I could get it down to a
reasonable size.  That's done now.

Thanks,
Bill


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215711 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 13:51:57 +00:00
Chandler Carruth
12e69a0267 [x86] Add the initial skeleton of type-based dispatch for AVX vectors in
the new shuffle lowering and an implementation for v4 shuffles.

This allows us to handle non-half-crossing shuffles directly for v4
shuffles, both integer and floating point. This currently misses places
where we could perform the blend via UNPCK instructions, but otherwise
generates equally good or better code for the test cases included to the
existing vector shuffle lowering. There are a few cases that are
entertainingly better. ;]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215702 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 11:01:40 +00:00
Tim Northover
f52efce72d ARM: implement MRS/MSR (banked reg) system instructions.
These are system-only instructions for CPUs with virtualization
extensions, allowing a hypervisor easy access to all of the various
different AArch32 registers.

rdar://problem/17861345

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215700 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 10:47:12 +00:00
Vladimir Medic
30bb8f60e5 Current implementation of c.cond.fmt instructions only accept default cc0 register. This patch enables the instruction to accept other fcc registers. The aliases with default fcc0 registers are also defined.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215698 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 09:29:30 +00:00
Chandler Carruth
886f0101a7 [x86] Fix the very broken formation of vpunpck instructions in the
target-specific shuffl DAG combines.

We were recognizing the paired shuffles backwards. This code needs to be
replaced anyways as we have the same functionality elsewhere, but I'll
do the refactoring in a follow-up, this is the minimal fix to the
behavior.

In addition to fixing miscompiles with the new vector shuffle lowering,
it also causes the canonicalization to kick in much better, selecting
the smaller encoding variants in lots of places in the new AVX path.
This still isn't quite ideal as we don't need both the shufpd and the
punpck instructions, but that'll get fixed in a follow-up patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215690 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 03:54:49 +00:00
Chandler Carruth
477f28c48d [x86] Fix PR20540 where the x86 shuffle DAG combiner had completely
broken logic for merging shuffle masks in the face of SM_SentinelZero
mask operands.

While these are '-1' they don't mean 'undef' the way '-1' means in the
pre-legalized shuffle masks. Instead, they mean that the shuffle
operation is forcibly zeroing that lane. Reflect this and explicitly
handle it in a bunch of places. In one place the effect is equivalent
but much more clear. In the rest it was really weirdly broken.

Also, rewrite the entire merging thing to be a more directy operation
with a single loop and just doing math to map the indices through the
various masks.

Also add a bunch of asserts to try to make in extremely clear what the
different masks can possibly look like.

Finally, add some comments to clarify that we're merging shuffle masks
*up* here rather than *down* as we do everywhere else, and thus the
logic is quite confusing.

Thanks to several different people for sending test cases, and for
Robert Khasanov for an initial attempt at fixing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215687 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-15 02:43:18 +00:00
Juergen Ributzka
266ecacfaa [FastISel][ARM] Fall-back to constant pool loads when materializing an i32 constant.
FastEmit_i won't always succeed to materialize an i32 constant and just fail.
This would trigger a fall-back to SelectionDAG, which is really not necessary.

This fix will first fall-back to a constant pool load to materialize the constant
before giving up for good.

This fixes <rdar://problem/18022633>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215682 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 23:29:49 +00:00
Hal Finkel
e1e7862f6e Copy noalias metadata from call sites to inlined instructions
When a call site with noalias metadata is inlined, that metadata can be
propagated directly to the inlined instructions (only those that might access
memory because it is not useful on the others). Prior to inlining, the noalias
metadata could express that a call would not alias with some other memory
access, which implies that no instruction within that called function would
alias. By propagating the metadata to the inlined instructions, we preserve
that knowledge.

This should complete the enhancements requested in PR20500.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215676 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 21:09:37 +00:00
Juergen Ributzka
6398a7f5fd Revert several FastISel commits to track down a buildbot error.
This reverts:
r215595 "[FastISel][X86] Add large code model support for materializing floating-point constants."
r215594 "[FastISel][X86] Use XOR to materialize the "0" value."
r215593 "[FastISel][X86] Emit more efficient instructions for integer constant materialization."
r215591 "[FastISel][AArch64] Make use of the zero register when possible."
r215588 "[FastISel] Let the target decide first if it wants to materialize a constant."
r215582 "[FastISel][AArch64] Cleanup constant materialization code. NFCI."

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215673 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 19:56:28 +00:00
Adam Nemet
41c5e687ed [AVX512] Add test for FMA masking instrinsics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215665 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 17:13:33 +00:00
Adam Nemet
90eb948fc9 [AVX512] Switch FMA intrinsics to the masking version
This does the renaming and updates the lowering logic.

Part of <rdar://problem/17688758>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215664 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 17:13:30 +00:00
Juergen Ributzka
14bc045838 Revert "[FastISel][AArch64] Add support for more addressing modes."
This reverts commits r215597, because it might have broken the build bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215659 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 17:10:54 +00:00
Hal Finkel
b1b9953473 Add noalias metadata for general calls (not just memory intrinsics) during inlining
When preserving noalias function parameter attributes by adding noalias
metadata in the inliner, we should do this for general function calls (not just
memory intrinsics). The logic is very similar to what already existed (except
that we want to add this metadata even for functions taking no relevant
parameters). This metadata can be used by ModRef queries in the caller after
inlining.

This addresses the first part of PR20500. Adding noalias metadata during
inlining is still turned off by default.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215657 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 16:44:03 +00:00
Chad Rosier
3b41039163 [Reassociation] Add support for reassociation with unsafe algebra.
Vector instructions are (still) not supported for either integer or floating
point.  Hopefully, that work will be landed shortly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215647 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 15:23:01 +00:00
Sanjay Patel
9615d702ad optimize vector fneg of bitcasted integer value
This patch allows a vector fneg of a bitcasted integer value to be optimized in the same way that we already optimize a scalar fneg. If the integer variable is a constant, we can precompute the result and not require any logic ops.

This patch is very similar to a fabs patch committed at r214892.

Differential Revision: http://reviews.llvm.org/D4852



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215646 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 15:15:28 +00:00
Rafael Espindola
1c9caced63 Delete support for AuroraUX.
auroraux.org is not resolving.

I will add this to the release notes as soon as I figure out where to put the
3.6 release notes :-)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215645 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 15:15:09 +00:00
Toma Tabacu
0b2081a05a [mips] Improve robustness of some tests.
Summary:
This is done by removing some hardcoded registers like $at or expecting a single digit register to be selected.

Contains work done by Matheus Almeida.

Reviewers: matheusalmeida, dsanders

Reviewed By: dsanders

Subscribers: tomatabacu

Differential Revision: http://reviews.llvm.org/D4227

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215640 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 13:10:48 +00:00
Chandler Carruth
cad1711154 [x86] Begin stubbing out the AVX support in the new vector shuffle
lowering scheme.

Currently, this just directly bails to the fallback path of splitting
the 256-bit vector into two 128-bit vectors, operating there, and then
joining the results back together. While the results are far from
perfect, they are *shockingly* good for what we're doing here. I'll be
layering the rest of the functionality on top of this piece by piece and
updating tests as I go.

Note that 256-bit vectors in this mode are still somewhat WIP. While
I think the code paths that I'm adding here are clean and good-to-go,
there are still a lot of 128-bit assumptions that I'll need to stomp out
as I march through the functional spread here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215637 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 12:13:59 +00:00
Toma Tabacu
cb43f81fc5 [mips] Add assembler support for the "la $reg,symbol" pseudo-instruction.
Summary:
This pseudo-instruction allows the programmer to load an address from a symbolic expression into a register.

Patch by David Chisnall.
His work was sponsored by: DARPA, AFRL

I've made some minor changes to the original, such as improving the formatting and adding some comments, and I've also added a test case.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D4808

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215630 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-14 10:29:17 +00:00