Commit Graph

28004 Commits

Author SHA1 Message Date
Tim Northover
8cd39a2630 Re-reapply r221924: "[GVN] Perform Scalar PRE on gep indices that feed loads before
doing Load PRE"

It's not really expected to stick around, last time it provoked a weird LTO
build failure that I can't reproduce now, and the bot logs are long gone. I'll
re-revert it if the failures recur.

Original description: Perform Scalar PRE on gep indices that feed loads before
doing Load PRE.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225536 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-09 19:19:56 +00:00
Daniel Sanders
8d7b0bdcf0 [mips] Add support for accessing $gp as a named register.
Summary:
Mips Linux uses $gp to hold a pointer to thread info structure and accesses it
with a named register. This makes this work for LLVM.

The N32 ABI doesn't quite work yet since the frontend generates incorrect IR
for this case. It neglects to truncate the 64-bit GPR to a 32-bit value before
converting to a pointer. Given correct IR (as in the testcase in this patch),
it works correctly.

Reviewers: sstankovic, vmedic, atanasyan

Reviewed By: atanasyan

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6893

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225529 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-09 17:21:30 +00:00
Hal Finkel
139bfee84c [PowerPC] Enable late partial unrolling on the POWER7
The P7 benefits from not have really-small loops so that we either have
multiple dispatch groups in the loop and/or the ability to form more-full
dispatch groups during scheduling. Setting the partial unrolling threshold to
44 seems good, empirically, for the P7. Compared to using no late partial
unrolling, this yields the following test-suite speedups:

SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding
	-66.3253% +/- 24.1975%
SingleSource/Benchmarks/Misc-C++/oopack_v1p8
	-44.0169% +/- 29.4881%
SingleSource/Benchmarks/Misc/pi
	-27.8351% +/- 12.2712%
SingleSource/Benchmarks/Stanford/Bubblesort
	-30.9898% +/- 22.4647%

I've speculatively added a similar setting for the P8. Also, I've noticed that
the unroller does not quite calculate the unrolling factor correctly for really
tiny loops because it neglects to account for the fact that not every loop body
replicant contains an ending branch and counter increment. I'll fix that later.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225522 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-09 15:51:16 +00:00
Saleem Abdulrasool
c2a1df7125 ARM: add support for R_ARM_ABS16
Add support for R_ARM_ABS16 relocation mapping.  Addresses PR22156.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225510 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-09 06:57:24 +00:00
Saleem Abdulrasool
466a7dea9b test: add additional test for SVN r225507
Add an additional test case to ensure that we generate the relocation even if
the thumb target is used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225509 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-09 06:57:18 +00:00
Saleem Abdulrasool
ea4fe48b22 ARM: add support for R_ARM_ABS8 relocations
Add support for R_ARM_ABS8 relocation.  Addresses PR22126.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225507 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-09 05:59:12 +00:00
Matthias Braun
c41acffe22 RegisterCoalescer: Fix removeCopyByCommutingDef with subreg liveness
The code that eliminated additional coalescable copies in
removeCopyByCommutingDef() used MergeValueNumberInto() which internally
may merge A into B or B into A. In this case A and B had different Def
points, so we have to reset ValNo.Def to the intended one after merging.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225503 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-09 03:01:31 +00:00
Hal Finkel
4e98296890 [PowerPC] Fold [sz]ext with fp_to_int lowering where possible
On modern cores with lfiw[az]x, we can fold a sign or zero extension from i32
to i64 into the load necessary for an i64 -> fp conversion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225493 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-09 01:34:30 +00:00
Duncan P. N. Exon Smith
3408708548 Utils: Keep distinct MDNodes distinct in MapMetadata()
Create new copies of distinct `MDNode`s instead of following the
uniquing `MDNode` logic.

Just like self-references (or other cycles), `MapMetadata()` creates a
new node.  In practice most calls use `RF_NoModuleLevelChanges`, in
which case nothing is duplicated anyway.

Part of PR22111.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225476 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 22:42:30 +00:00
Duncan P. N. Exon Smith
f416d72973 IR: Add 'distinct' MDNodes to bitcode and assembly
Propagate whether `MDNode`s are 'distinct' through the other types of IR
(assembly and bitcode).  This adds the `distinct` keyword to assembly.

Currently, no one actually calls `MDNode::getDistinct()`, so these nodes
only get created for:

  - self-references, which are never uniqued, and
  - nodes whose operands are replaced that hit a uniquing collision.

The concept of distinct nodes is still not quite first-class, since
distinct-ness doesn't yet survive across `MapMetadata()`.

Part of PR22111.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225474 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 22:38:29 +00:00
Hal Finkel
b7c01bf403 [PowerPC] Mark all instructions as non-cheap for MachineLICM
MachineLICM uses a callback named hasLowDefLatency to determine if an
instruction def operand has a 'low' latency. If all relevant operands have a
'low' latency, the instruction is considered too cheap to hoist out of loops
even in low-register-pressure situations. On PowerPC cores, both the embedded
cores and the others, there is no reason to believe that this is a good choice:
all instructions have a cost inside a loop, and hoisting them when not limited
by register pressure is a reasonable default.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225471 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 22:11:49 +00:00
Akira Hatanaka
40cd57eb5c [ARM] Fix a bug in constant island pass that was triggering an assertion.
The assert was being triggered when the distance between a constant pool entry
and its user exceeded the maximally allowed distance after thumb2 branch
shortening. A padding was inserted after a thumb2 branch instruction was shrunk,
which caused the user to be out of range. This is wrong as the padding should
have been inserted by the layout algorithm so that the distance between two
instructions doesn't grow later during thumb2 instruction optimization.

This commit fixes the code in ARMConstantIslands::createNewWater to call
computeBlockSize and set BasicBlock::Unalign when a branch instruction is
inserted to create new water after a basic block. A non-zero Unalign causes
the worst-case padding to be inserted when adjustBBOffsetsAfter is called to
recompute the basic block offsets.

rdar://problem/19130476


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225467 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 20:44:50 +00:00
Matt Arsenault
3b1f741856 Fix fcmp + fabs instcombines when using the intrinsic
This was only handling the libcall. This is another example
of why only the intrinsic should ever be used when it exists.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225465 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 20:09:34 +00:00
Lang Hames
1b3d915de6 [MCJIT] Remove a few redundant MCJIT tests, and drop the extraneous datalayout
strings from the copies that remain.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225460 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 18:52:15 +00:00
Rafael Espindola
8aab70ebfe Make this test a bit stricter.
It now checks for the end of the line or the opening '{'.
While at it, remove empty comments.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225451 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 16:11:18 +00:00
Justin Hibbits
77e85a150c Add saving and restoring of r30 to the prologue and epilogue, respectively
Summary: The PIC additions didn't update the prologue and epilogue code to save and restore r30 (PIC base register).  This does that.

Test Plan: Tests updated.

Reviewers: hfinkel

Reviewed By: hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6876

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225450 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 15:47:19 +00:00
Kristof Beyls
d1cee9b3bc Fix large stack alignment codegen for ARM and Thumb2 targets
This partially fixes PR13007 (ARM CodeGen fails with large stack
alignment): for ARM and Thumb2 targets, but not for Thumb1, as it
seems stack alignment for Thumb1 targets hasn't been supported at
all.

Producing an aligned stack pointer is done by zero-ing out the lower
bits of the stack pointer. The BIC instruction was used for this.
However, the immediate field of the BIC instruction only allows to
encode an immediate that can zero out up to a maximum of the 8 lower
bits. When a larger alignment is requested, a BIC instruction cannot
be used; llvm was silently producing incorrect code in this case.

This commit fixes code generation for large stack aligments by
using the BFC instruction instead, when the BFC instruction is
available.  When not, it uses 2 instructions: a right shift,
followed by a left shift to zero out the lower bits.

The lowering of ARM::Int_eh_sjlj_dispatchsetup still has code
that unconditionally uses BIC to realign the stack pointer, so it
very likely has the same problem. However, I wasn't able to
produce a test case for that. This commit adds an assert so that
the compiler will fail the assert instead of silently generating
wrong code if this is ever reached.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225446 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 15:09:14 +00:00
Tom Stellard
9a6e4f08fe R600/SI: Remove SIISelLowering::legalizeOperands()
Its functionality has been replaced by calling
SIInstrInfo::legalizeOperands() from
SIISelLowering::AdjstInstrPostInstrSelection() and running the
SIFoldOperands and SIShrinkInstructions passes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225445 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 15:08:17 +00:00
Elena Demikhovsky
6e8b53da17 Masked Load/Store - fixed a bug in type legalization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225441 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 12:29:19 +00:00
Michael Kuperstein
477eba5f81 Fix a think-o in the test for r225438.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225440 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 12:05:02 +00:00
Michael Kuperstein
0858c28ca8 [X86] Don't try to generate direct calls to TLS globals
The call lowering assumes that if the callee is a global, we want to emit a direct call.
This is correct for regular globals, but not for TLS ones.

Differential Revision: http://reviews.llvm.org/D6862

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225438 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 11:50:58 +00:00
Craig Topper
cb964a5c58 Fix test case I missed in r225432.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225434 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 07:57:27 +00:00
Craig Topper
367b67df3e [X86] Don't print 'dword ptr' or 'qword ptr' on the operand to some of the LEA variants in Intel syntax. The memory operand is inherently unsized.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225432 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 07:41:30 +00:00
Adrian Prantl
7e44a65e6b Revert "Reapply: Teach SROA how to update debug info for fragmented variables."
This reverts commit r225379 while investigating an assertion failure reported
by Alexey.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225424 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 02:02:00 +00:00
Quentin Colombet
9d60e0ff0a [RegAllocGreedy] Introduce a late pass to repair broken hints.
A broken hint is a copy where both ends are assigned different colors. When a
variable gets evicted in the neighborhood of such copies, it is likely we can
reconcile some of them.


** Context **

Copies are inserted during the register allocation via splitting. These split
points are required to relax the constraints on the allocation problem. When
such a point is inserted, both ends of the copy would not share the same color
with respect to the current allocation problem. When variables get evicted,
the allocation problem becomes different and some split point may not be
required anymore. However, the related variables may already have been colored.

This usually shows up in the assembly with pattern like this:
def A
...
save A to B
def A
use A
restore A from B
...
use B

Whereas we could simply have done:
def B
...
def A
use A
...
use B


** Proposed Solution **

A variable having a broken hint is marked for late recoloring if and only if
selecting a register for it evict another variable. Indeed, if no eviction
happens this is pointless to look for recoloring opportunities as it means the
situation was the same as the initial allocation problem where we had to break
the hint.

Finally, when everything has been allocated, we look for recoloring
opportunities for all the identified candidates.
The recoloring is performed very late to rely on accurate copy cost (all
involved variables are allocated).
The recoloring is simple unlike the last change recoloring. It propagates the
color of the broken hint to all its copy-related variables. If the color is
available for them, the recoloring uses it, otherwise it gives up on that hint
even if a more complex coloring would have worked.

The recoloring happens only if it is profitable. The profitability is evaluated
using the expected frequency of the copies of the currently recolored variable
with a) its current color and b) with the target color. If a) is greater or
equal than b), then it is profitable and the recoloring happen.


** Example **

Consider the following example:
BB1:
  a =
  b =
BB2:
  ...
   = b
   = a
Let us assume b gets split:
BB1:
  a =
  b =
BB2:
  c = b
  ...
  d = c
  = d
  = a
Because of how the allocation work, b, c, and d may be assigned different
colors. Now, if a gets evicted to make room for c, assuming b and d were
assigned to something different than a.
We end up with:
BB1:
  a =
  st a, SpillSlot
  b =
BB2:
  c = b
  ...
  d = c
  = d
  e = ld SpillSlot
  = e
This is likely that we can assign the same register for b, c, and d,
getting rid of 2 copies.


** Performances **

Both ARM64 and x86_64 show performance improvements of up to 3% for the
llvm-testsuite + externals with Os and O3. There are a few regressions too that
comes from the (in)accuracy of the block frequency estimate.

<rdar://problem/18312047>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225422 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-08 01:16:39 +00:00
Matthias Braun
a065cf13cd RegisterCoalescer: Fix valuesIdentical() in some subrange merge cases.
I got confused and assumed SrcIdx/DstIdx of the CoalescerPair is a
subregister index in SrcReg/DstReg, but they are actually subregister
indices of the coalesced register that get you back to SrcReg/DstReg
when applied.

Fixed the bug, improved comments and simplified code accordingly.

Testcase by Tom Stellard!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225415 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 23:58:38 +00:00
Philip Reames
a7f8f932a6 [GC] improve testing around gc.relocate and fix a test
Patch by: Ramkumar Ramachandra <artagnon@gmail.com>

"This patch started out as an exploration of gc.relocate, and an attempt
to write a simple test in call-lowering. I then noticed that the
arguments of gc.relocate were not checked fully, so I went in and fixed
a few things. Finally, the most important outcome of this patch is that
my new error handling code caught a bug in a callsite in
stackmap-format."

Differential Revision: http://reviews.llvm.org/D6824



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225412 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 22:48:01 +00:00
Tom Stellard
a36b682c17 R600/SI: Commute instructions to enable more folding opportunities
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225410 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 22:44:19 +00:00
Tom Stellard
a3ee583339 R600/SI: Only fold immediates that have one use
Folding the same immediate into multiple instruction will increase
program size, which can hurt performance.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225405 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 22:18:27 +00:00
Duncan P. N. Exon Smith
c742e3a68d Linker: Don't use MDNode::replaceOperandWith()
`MDNode::replaceOperandWith()` changes all instances of metadata.  Stop
using it when linking module flags, since (due to uniquing) the flag
values could be used by other metadata.

Instead, use new API `NamedMDNode::setOperand()` to update the reference
directly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225397 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 21:32:27 +00:00
Alexey Samsonov
ec1494b99f XFAIL several MCJIT EH tests under ASan and MSan bootstrap.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225393 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 21:27:26 +00:00
Rafael Espindola
5061ecc615 Add a test that would have found the issue in r224935.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225385 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 21:10:25 +00:00
Kevin Enderby
60e9ca4c0f Slightly refactor things for llvm-objdump and the -macho option so it can be used with
options other than just -disassemble so that universal files can be used with other
options combined with -arch options.

No functional change to existing options and use.  One test case added for the
additional functionality with a universal file an a -arch option.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225383 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 21:02:18 +00:00
Olivier Sallenave
033a537a84 More FMA folding opportunities.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225380 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 20:54:17 +00:00
Adrian Prantl
50bf54ccf4 Reapply: Teach SROA how to update debug info for fragmented variables.
The two buildbot failures were addressed in LLVM r225378 and CFE r225359.

This rapplies commit 225272 without modifications.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225379 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 20:52:22 +00:00
Adrian Prantl
7950596b55 Debug info: Allow aggregate types to be described by constants.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225378 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 20:48:58 +00:00
Colin LeMahieu
51817073b3 [Hexagon] Adding floating point classification and creation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225374 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 20:28:57 +00:00
Tom Stellard
f7587043ef R600/SI: Add a V_MOV_B64 pseudo instruction
This is used to simplify the SIFoldOperands pass and make it easier to
fold immediates.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225373 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 20:27:25 +00:00
Colin LeMahieu
22ddfae848 [Hexagon] Adding encodings for v5 floating point instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225372 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 20:24:09 +00:00
Colin LeMahieu
22efbc70a7 [Hexagon] Adding encoding for popcount, fastcorner, dword asr with rounding.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225371 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 20:07:28 +00:00
Tom Stellard
546520a727 R600/SI: Teach SIFoldOperands to split 64-bit constants when folding
This allows folding of sequences like:

s[0:1] = s_mov_b64 4
v_add_i32 v0, s0, v0
v_addc_u32 v1, s1, v1

into

v_add_i32 v0, 4, v0
v_add_i32 v1, 0, v1

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225369 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 19:56:17 +00:00
Philip Reames
28fa9e1e9f Introduce an example statepoint GC strategy
This change includes the most basic possible GCStrategy for a GC which is using the statepoint lowering code. At the moment, this GCStrategy doesn't really do much - aside from actually generate correct stackmaps that is - but I went ahead and added a few extra correctness checks as proof of concept. It's mostly here to provide documentation on how to do one, and to provide a point for various optimization legality hooks I'd like to add going forward. (For context, see the TODOs in InstCombine around gc.relocate.)

Most of the validation logic added here as proof of concept will soon move in to the Verifier.  That move is dependent on http://reviews.llvm.org/D6811

There was discussion in the review thread about addrspace(1) being reserved for something.  I'm going to follow up on a seperate llvmdev thread.  If needed, I'll update all the code at once.

Note that I am deliberately not making a GCStrategy required to use gc.statepoints with this change. I want to give folks out of tree - including myself - a chance to migrate. In a week or two, I'll make having a GCStrategy be required for gc.statepoints. To this end, I added the gc tag to one of the test cases but not others.

Differential Revision: http://reviews.llvm.org/D6808



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225365 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 19:07:50 +00:00
David Majnemer
60812c05e7 X86: Allow the stack probe size to be configurable per function
LLVM emits stack probes on Windows targets to ensure that the stack is
correctly accessed.  However, the amount of stack allocated before
emitting such a probe is hardcoded to 4096.

It is desirable to have this be configurable so that a function might
opt-out of stack probes.  Our level of granularity is at the function
level instead of, say, the module level to permit proper generation of
code after LTO.

Patch by Andrew H!

N.B.  The inliner needs to be updated to properly consider what happens
after inlining a function with a specific stack-probe-size into another
function with a different stack-probe-size.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225360 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 18:14:07 +00:00
Ahmed Bougacha
d412c608fc [X86] Teach FCOPYSIGN lowering to recognize constant magnitudes.
For code like:
    float foo(float x) { return copysign(1.0, x); }
We used to generate:
    andps  <-0.000000e+00,0,0,0>, %xmm0
    movss  <1.000000e+00>, %xmm1
    andps  <nan>, %xmm1
    orps   %xmm0, %xmm1
Basically doing an abs(1.0f) in the two middle instructions.

We now generate:
    andps  <-0.000000e+00,0,0,0>, %xmm0
    orps   <1.000000e+00,0,0,0>, %xmm0

Builds on cleanups r223415, r223542.
rdar://19049548
Differential Revision: http://reviews.llvm.org/D6555


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225357 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 17:33:03 +00:00
Charlie Turner
7fbbc81d65 [ARM] Add missing Tag_DIV_use tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225348 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 11:37:40 +00:00
Chandler Carruth
7372d445af [PM] Give slightly less horrible names to the utility pass templates for
requiring and invalidating specific analyses. Also make their printed
names match their class names. Writing these out as prose really doesn't
make sense to me any more.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225346 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 11:14:51 +00:00
Karthik Bhat
f2b3638c3d Revert r225165 and r225169
Even thouh gcc produces simialr instructions as Owen pointed out the two patterns aren’t equivalent in the case
where the original subtraction could have caused an overflow.
Reverting the same.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225341 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 06:34:34 +00:00
Chandler Carruth
9fc5a53118 [PM] Fix a pretty nasty bug where the new pass manager would invalidate
passes too many time.

I think this is actually the issue that someone raised with me at the
developer's meeting and in an email, but that we never really got to the
bottom of. Having all the testing utilities made it much easier to dig
down and uncover the core issue.

When a pass manager is running many passes over a single function, we
need it to invalidate the analyses between each run so that they can be
re-computed as needed. We also need to track the intersection of
preserved higher-level analyses across all the passes that we run (for
example, if there is one module analysis which all the function analyses
preserve, we want to track that and propagate it). Unfortunately, this
interacted poorly with any enclosing pass adaptor between two IR units.
It would see the intersection of preserved analyses, and need to
invalidate any other analyses, but some of the un-preserved analyses
might have already been invalidated *and recomputed*! We would fail to
propagate the fact that the analysis had already been invalidated.

The solution to this struck me as really strange at first, but the more
I thought about it, the more natural it seemed. After a nice discussion
with Duncan about it on IRC, it seemed even nicer. The idea is that
invalidating an analysis *causes* it to be preserved! Preserving the
lack of result is trivial. If it is recomputed, great. Until something
*else* invalidates it again, we're good.

The consequence of this is that the invalidate methods on the analysis
manager which operate over many passes now consume their
PreservedAnalyses object, update it to "preserve" every analysis pass to
which it delivers an invalidation (regardless of whether the pass
chooses to be removed, or handles the invalidation itself by updating
itself). Then we return this augmented set from the invalidate routine,
letting the pass manager take the result and use the intersection of
*that* across each pass run to compute the final preserved set. This
accounts for all the places where the early invalidation of an analysis
has already "preserved" it for a future run.

I've beefed up the testing and adjusted the assertions to show that we
no longer repeatedly invalidate or compute the analyses across nested
pass managers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225333 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-07 01:58:35 +00:00
Matt Arsenault
6a72b20325 R600/SI: Add combine for isinfinite pattern
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225310 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 23:00:46 +00:00
Matt Arsenault
42d9f7cf0a R600/SI: Pattern match isinf to v_cmp_class instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225307 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 23:00:41 +00:00
Matt Arsenault
a5b2b64292 R600/SI: Add basic DAG combines for fp_class
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225306 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 23:00:39 +00:00
Matt Arsenault
b6520ab625 R600/SI: Add class intrinsic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225305 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 23:00:37 +00:00
Matt Arsenault
374b57cec9 Fix using wrong intrinsic in test
This is a leftover from renaming the intrinsic.
It's surprising the unknown llvm. intrinsic wasn't rejected.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225304 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 23:00:33 +00:00
Rafael Espindola
f907a26bc2 Change the .ll syntax for comdats and add a syntactic sugar.
In order to make comdats always explicit in the IR, we decided to make
the syntax a bit more compact for the case of a GlobalObject in a
comdat with the same name.

Just dropping the $name causes problems for

@foo = globabl i32 0, comdat
$bar = comdat ...

and

declare void @foo() comdat
$bar = comdat ...

So the syntax is changed to

@g1 = globabl i32 0, comdat($c1)
@g2 = globabl i32 0, comdat

and

declare void @foo() comdat($c1)
declare void @foo() comdat

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225302 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 22:55:16 +00:00
Hal Finkel
8e9ba0e588 [PowerPC] Reuse a load operand in int->fp conversions
int->fp conversions on PPC must be done through memory loads and stores. On a
modern core, this process begins by storing the int value to memory, then
loading it using a (sometimes special) FP load instruction. Unfortunately, we
would do this even when the value to be converted was itself a load, and we can
just use that same memory location instead of copying it to another first.
There is a slight complication when handling int_to_fp(fp_to_int(x)) pairs,
because the fp_to_int operand has not been lowered when the int_to_fp is being
lowered. We handle this specially by invoking fp_to_int's lowering logic
(partially) and getting the necessary memory location (some trivial refactoring
was done to make this possible).

This is all somewhat ugly, and it would be nice if some later CodeGen stage
could just clean this stuff up, but because doing so would involve modifying
target-specific nodes (or instructions), it is not immediately clear how that
would work.

Also, remove a related entry from the README.txt for which we now generate
reasonable code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225301 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 22:31:02 +00:00
Colin LeMahieu
a602a7f199 [Hexagon] Adding compound jump encodings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225291 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 20:03:31 +00:00
Tom Stellard
bac89f3dd2 R600/SI: Insert s_waitcnt before s_barrier instructions.
This ensures that all memory operations are complete when all threads
reach the barrier.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225290 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 19:52:07 +00:00
Adrian Prantl
d2c42b9617 Revert "Reapply: Teach SROA how to update debug info for fragmented variables."
because of a tsan buildbot failure.
This reverts commit 225272.

Fix should be coming soon.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225288 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 19:47:27 +00:00
Colin LeMahieu
3d1d6d9043 [Hexagon] Adding encoding for misc v4 instructions: boundscheck, tlbmatch, dcfetch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225283 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 19:03:20 +00:00
Sanjoy Das
31123d4529 This patch teaches IndVarSimplify to add nuw and nsw to certain kinds
of operations that provably don't overflow. For example, we can prove
%civ.inc below does not sign-overflow. With this change,
IndVarSimplify changes %civ.inc to an add nsw.

  define i32 @foo(i32* %array, i32* %length_ptr, i32 %init) {
   entry:
    %length = load i32* %length_ptr, !range !0
    %len.sub.1 = sub i32 %length, 1
    %upper = icmp slt i32 %init, %len.sub.1
    br i1 %upper, label %loop, label %exit
  
   loop:
    %civ = phi i32 [ %init, %entry ], [ %civ.inc, %latch ]
    %civ.inc = add i32 %civ, 1
    %cmp = icmp slt i32 %civ.inc, %length
    br i1 %cmp, label %latch, label %break
  
   latch:
    store i32 0, i32* %array
    %check = icmp slt i32 %civ.inc, %len.sub.1
    br i1 %check, label %loop, label %break
  
   break:
    ret i32 %civ.inc
  
   exit:
    ret i32 42
  }

Differential Revision: http://reviews.llvm.org/D6748



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225282 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 19:02:56 +00:00
Colin LeMahieu
63d0449f11 [Hexagon] Adding encoding information for absolute address loads.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225279 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 18:38:26 +00:00
Tom Stellard
1f996fa36b R600/SI: Add a stub GCNTargetMachine
This is equivalent to the AMDGPUTargetMachine now, but it is the
starting point for separating R600 and GCN functionality into separate
targets.

It is recommened that users start using the gcn triple for GCN-based
GPUs, because using the r600 triple for these GPUs will be deprecated in
the future.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225277 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 18:00:21 +00:00
Andrea Di Biagio
e46783d5b7 [CodeGenPrepare] Improved logic to speculate calls to cttz/ctlz.
This patch improves the logic added at revision 224899 (see review D6728) that
teaches the backend when it is profitable to speculate calls to cttz/ctlz.

The original algorithm conservatively avoided speculating more than one
instruction from a basic block in a control flow grap modelling an if-statement.
In particular, the only allowed instruction (excluding the terminator) was a
call to cttz/ctlz. However, there are cases where we could be less conservative
and still be able to speculate a call to cttz/ctlz.

With this patch, CodeGenPrepare now tries to speculate a cttz/ctlz if the
result is zero extended/truncated in the same basic block, and the zext/trunc
instruction is "free" for the target.

Added new test cases to CodeGen/X86/cttz-ctlz.ll

Differential Revision: http://reviews.llvm.org/D6853


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225274 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 17:41:18 +00:00
Adrian Prantl
46cb54c0fb Reapply: Teach SROA how to update debug info for fragmented variables.
This also rolls in the changes discussed in http://reviews.llvm.org/D6766.
Defers migrating the debug info for new allocas until after all partitions
are created.

Thanks to Chandler for reviewing!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225272 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 17:14:10 +00:00
Filipe Cabecinhas
d682839830 Don't loop endlessly for MachO files with 0 ncmds
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225271 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 17:08:26 +00:00
Hal Finkel
15914b5c22 [PowerPC] Add a regression test for r225251
In r225251, I removed an old entry from the README.txt file. While there are
several contributing factors (including pieces in Clang's ABI code), upon
further reflection, the backend part deserves a regression test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225268 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 16:46:37 +00:00
Colin LeMahieu
a24e012976 [Hexagon] Adding dealloc_return encoding and absolute address stores.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225267 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 16:15:15 +00:00
Matt Arsenault
d883ca0ca7 Convert fcmp with 0.0 from casted integers to icmp
This is already handled in general when it is known the
conversion can't lose bits with smaller integer types
casted into wider floating point types.

This pattern happens somewhat often in GPU programs that cast
workitem intrinsics to float, which are often compared with 0.

Specifically handle the special case of compares with zero which
should also be known to not lose information. I had a more general
version of this which allows equality compares if the casted float is
exactly representable in the integer, but I'm not 100% confident that
is always correct.

Also fold cases that aren't integers to true / false.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225265 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 15:50:59 +00:00
Chandler Carruth
0df89054c0 [PM] Introduce a utility pass that preserves no analyses.
Use this to test that path of invalidation. This test actually shows
redundant invalidation here that is really bad. I'm going to work on
fixing that next, but wanted to commit the test harness now that its all
working.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225257 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 09:06:35 +00:00
Craig Topper
f6145affbf [X86] Add OpSize32 to XBEGIN_4. Add XBEGIN_2 with OpSize16.
Requires new AsmParserOperand types that detect 16-bit and 32/64-bit mode so that we choose the right instruction based on default sizing without predicates. This is necessary since predicates mess up the disassembler table building.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225256 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 08:59:30 +00:00
David Majnemer
51e4a66417 InstCombine: Bitcast call arguments from/to pointer/integer type
Try harder to get rid of bitcast'd calls by ptrtoint/inttoptr'ing
arguments and return values when DataLayout says it is safe to do so.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225254 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 08:41:31 +00:00
Chandler Carruth
6409bd68de [PM] Simplify how we parse the outer layer of the pass pipeline text and
remove an extra, redundant pass manager wrapping every run.

I had kept seeing these when manually testing, but it was getting really
annoying and was going to cause problems with overly eager invalidation.
The root cause was an overly complex and unnecessary pile of code for
parsing the outer layer of the pass pipeline. We can instead delegate
most of this to the recursive pipeline parsing.

I've added some somewhat more basic and precise tests to catch this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225253 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 08:37:58 +00:00
David Majnemer
b3065539bd X86: Don't make illegal GOTTPOFF relocations
"ELF Handling for Thread-Local Storage" specifies that R_X86_64_GOTTPOFF
relocation target a movq or addq instruction.

Prohibit the truncation of such loads to movl or addl.

This fixes PR22083.

Differential Revision: http://reviews.llvm.org/D6839

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225250 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 07:12:52 +00:00
Hal Finkel
10ae865847 [PowerPC] Improve int_to_fp(fp_to_int(x)) combining
The old target DAG combine that allowed for performing int_to_fp(fp_to_int(x))
without a load/store pair is updated here with support for unsigned integers,
and to support single-precision values without a third rounding step, on newer
cores with the appropriate instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225248 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 06:01:57 +00:00
Chandler Carruth
17395fa733 [PM] Add a utility pass template that synthesizes the invalidation of
a specific analysis result.

This is quite handy to test things, and will also likely be very useful
for debugging issues. You could narrow down pass validation failures by
walking these invalidate pass runs up and down the pass pipeline, etc.
I've added support to the pass pipeline parsing to be able to create one
of these for any analysis pass desired.

Just adding this class uncovered one latent bug where the
AnalysisManager CRTP base class had a hard-coded Module type rather than
using IRUnitT.

I've also added tests for invalidation and caching of analyses in
a basic way across all the pass managers. These in turn uncovered two
more bugs where we failed to correctly invalidate an analysis -- its
results were invalidated but the key for re-running the pass was never
cleared and so it was never re-run. Quite nasty. I'm very glad to debug
this here rather than with a full system.

Also, yes, the naming here is horrid. I'm going to update some of the
names to be slightly less awful shortly. But really, I've no "good"
ideas for naming. I'll be satisfied if I can get it to "not bad".

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225246 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 04:49:44 +00:00
Chandler Carruth
5b12a2f703 [PM] Add a collection of no-op analysis passes and switch the new pass
manager tests to use them and be significantly more comprehensive.

This, naturally, uncovered a bug where the CGSCC pass manager wasn't
printing analyses when they were run.

The only remaining core manipulator is I think an invalidate pass
similar to the require pass. That'll be next. =]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225240 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 02:50:06 +00:00
Chandler Carruth
a3376d2d36 [PM] Add a utility to the new pass manager for generating a pass which
is a no-op other than requiring some analysis results be available.

This can be used in real pass pipelines to force the usually lazy
analysis running to eagerly compute something at a specific point, and
it can be used to test the pass manager infrastructure (my primary use
at the moment).

I've also added bit of pipeline parsing magic to support generating
these directly from the opt command so that you can directly use these
when debugging your analysis. The syntax is:

  require<analysis-name>

This can be used at any level of the pass manager. For example:

  cgscc(function(require<my-analysis>,no-op-function))

This would produce a no-op function pass requiring my-analysis, followed
by a fully no-op function pass, both of these in a function pass manager
which is nested inside of a bottom-up CGSCC pass manager which is in the
top-level (implicit) module pass manager.

I have zero attachment to the particular syntax I'm using here. Consider
it a straw man for use while I'm testing and fleshing things out.
Suggestions for better syntax welcome, and I'll update everything based
on any consensus that develops.

I've used this new functionality to more directly test the analysis
printing rather than relying on the cgscc pass manager running an
analysis for me. This is still minimally tested because I need to have
analyses to run first! ;] That patch is next, but wanted to keep this
one separate for easier review and discussion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225236 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 02:10:51 +00:00
Rafael Espindola
5165dfdf9a Add a testcase that would have found the problem in r225048.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225235 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 01:41:24 +00:00
Lang Hames
bce877c84c Revert r225048: It broke ObjC on AArch64.
I've filed http://llvm.org/PR22100 to track this issue.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225228 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-06 00:54:32 +00:00
Hal Finkel
38c3e2f5c5 [PowerPC] Fix test to pass on Darwin hosts
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225220 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 23:17:43 +00:00
Hal Finkel
fcfee17911 [PowerPC] Convert a README.txt entry into a better test
We now produce the desired code as noted in the README.txt file (no spurious
or). Remove the README entry and improve the regression test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225214 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 21:53:52 +00:00
Colin LeMahieu
e4f1dcdb83 [Hexagon] Adding add/sub with carry, logical shift left by immediate and memop instructions. Removing old defs without bits and updating references.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225210 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 21:36:38 +00:00
Hal Finkel
1b84bf2554 [PowerPC] Add a test for truncating a shifted load
We now produce the desired code as noted in the README.txt file. Remove the
README entry and add a regression test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225209 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 21:33:14 +00:00
Frederic Riss
5a0743e1e8 [dsymutil] Implement the BinaryHolder object and gain archive support.
This object is meant to own the ObjectFiles and their underlying
MemoryBuffer. It is basically the equivalent of an OwningBinary
except that it efficiently handles Archives. It is optimized for
efficiently providing mappings of members of the same archive when
they are opened successively (which is standard in Darwin debug
maps, objects from the same archive will be contiguous).

Of course, the BinaryHolder will also be used by the DWARF linker
once it is commited, but for now only the debug map parser uses it.

With this change, you can run llvm-dsymutil on your Darwin debug build
of clang and get a complete debug map for it.

Differential Revision: http://reviews.llvm.org/D6690

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225207 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 21:29:28 +00:00
Hal Finkel
e7d845b709 [PowerPC] Add another test for load/store with update
We now produce the desired code as noted in the README.txt file. Remove the
README entry and add a regression test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225205 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 21:22:42 +00:00
Hal Finkel
ccc83e4a08 [PowerPC] Fold i1 extensions with other ops
Consider this function from our README.txt file:

  int foo(int a, int b) { return (a < b) << 4; }

We now explicitly track CR bits by default, so the comment in the README.txt
about not really having a SETCC is no longer accurate, but we did generate this
somewhat silly code:

        cmpw 0, 3, 4
        li 3, 0
        li 12, 1
        isel 3, 12, 3, 0
        sldi 3, 3, 4
        blr

which generates the zext as a select between 0 and 1, and then shifts the
result by a constant amount. Here we preprocess the DAG in order to fold the
results of operations on an extension of an i1 value into the SELECT_I[48]
pseudo instruction when the resulting constant can be materialized using one
instruction (just like the 0 and 1). This was not implemented as a DAGCombine
because the resulting code would have been anti-canonical and depends on
replacing chained user nodes, which does not fit well into the lowering
paradigm. Now we generate:

        cmpw 0, 3, 4
        li 3, 0
        li 12, 16
        isel 3, 12, 3, 0
        blr

which is less silly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225203 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 21:10:24 +00:00
Colin LeMahieu
ca96263b05 [Hexagon] Adding rounding reg/reg variants, accumulating multiplies, and accumulating shifts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225201 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 20:56:41 +00:00
Colin LeMahieu
27494b0633 [Hexagon] Adding V4 bit manipulating instructions, removing ALU defs without encoding bits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225199 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 20:35:54 +00:00
Colin LeMahieu
c8e734a561 [Hexagon] Adding V4 logic-logic instructions and tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225198 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 20:14:58 +00:00
Colin LeMahieu
e48ec2a918 [Hexagon] Adding orand, bitsplit reg/reg, and modwrap instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225197 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 20:04:40 +00:00
Hal Finkel
3ab10c1918 [PowerPC] Remove zexts after i32 ctlz
The 64-bit semantics of cntlzw are not special, the 32-bit population count is
stored as a 64-bit value in the range [0,32]. As a result, it is always zero
extended, and it can be added to the PPCISelDAGToDAG peephole optimization as a
frontier instruction for the removal of unnecessary zero extensions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225192 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 18:52:29 +00:00
Hal Finkel
0ef99720c5 [PowerPC] Remove zexts after byte-swapping loads
lhbrx and lwbrx not only load their data with byte swapping, but also clear the
upper 32 bits (at least). As a result, they can be added to the PPCISelDAGToDAG
peephole optimization as frontier instructions for the removal of unnecessary
zero extensions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225189 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 18:09:06 +00:00
Colin LeMahieu
9e989cf190 [Hexagon] Adding round reg/imm and bitsplit instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225188 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 18:08:21 +00:00
Ahmed Bougacha
3c9fb6e1ad [AArch64] Improve codegen of store lane instructions by avoiding GPR usage.
We used to generate code similar to:

  umov.b        w8, v0[2]
  strb  w8, [x0, x1]

because the STR*ro* patterns were preferred to ST1*.
Instead, we can avoid going through GPRs, and generate:

  add   x8, x0, x1
  st1.b { v0 }[2], [x8]

This patch increases the ST1* AddedComplexity to achieve that.

rdar://16372710
Differential Revision: http://reviews.llvm.org/D6202


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225183 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 17:10:26 +00:00
Ahmed Bougacha
c52cd839b9 [AArch64] Improve codegen of store lane 0 instructions by directly storing the subregister.
For 0-lane stores, we used to generate code similar to:

  fmov w8, s0
  str w8, [x0, x1, lsl #2]

instead of:

  str s0, [x0, x1, lsl #2]

To correct that: for store lane 0 patterns, directly match to STR <subreg>0.

Byte-sized instructions don't have the special case for a 0 index,
because FPR8s are defined to have untyped content.

rdar://16372710
Differential Revision: http://reviews.llvm.org/D6772


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225181 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 17:02:28 +00:00
NAKAMURA Takumi
19d9f342ed llvm/test/lit.cfg: have_ld_plugin_support(): Use decode() for stdout.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225171 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 14:18:04 +00:00
Karthik Bhat
050064d32c Select lower fsub,fabs pattern to fabd on AArch64
This patch lowers patterns such as-
  fsub   v0.4s, v0.4s, v1.4s
  fabs   v0.4s, v0.4s
to
  fabd  v0.4s, v0.4s, v1.4s
on AArch64.

Review: http://reviews.llvm.org/D6791



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225169 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 13:57:59 +00:00
Charlie Turner
6abfc44aab Parse Tag_compatibility correctly.
Tag_compatibility takes two arguments, but before this patch it would
erroneously accept just one, it now produces an error in that case.

Change-Id: I530f918587620d0d5dfebf639944d6083871ef7d

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225167 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 13:26:37 +00:00
Charlie Turner
b99b8ffb7f Emit the build attribute Tag_conformance.
Claim conformance to version 2.09 of the ARM ABI.

This build attribute must be emitted first amongst the build attributes when
written to an object file. This is to simplify conformance detection by
consumers.

Change-Id: If9eddcfc416bc9ad6e5cc8cdcb05d0031af7657e

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225166 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 13:12:17 +00:00
Karthik Bhat
e239724d12 Select lower sub,abs pattern to sabd on AArch64
This patch lowers patterns such as-
  sub	v0.4s, v0.4s, v1.4s
  abs	v0.4s, v0.4s
to
  sabd	v0.4s, v0.4s, v1.4s
on AArch64.

Review: http://reviews.llvm.org/D6781



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225165 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 13:11:07 +00:00
Michael Kuperstein
25903ef9bc Fix broken test from r225159.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225164 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 12:34:01 +00:00
Chandler Carruth
1ab487fdc7 [PM] Don't run the machinery of invalidating all the analysis passes
when all are being preserved.

We want to short-circuit this for a couple of reasons. One, I don't
really want passes to grow a dependency on actually receiving their
invalidate call when they've been preserved. I'm thinking about removing
this entirely. But more importantly, preserving everything is likely to
be the common case in a lot of scenarios, and it would be really good to
bypass all of the invalidation and preservation machinery there.
Avoiding calling N opaque functions to try to invalidate things that are
by definition still valid seems important. =]

This wasn't really inpsired by much other than seeing the spam in the
logging for analyses, but it seems better ot get it checked in rather
than forgetting about it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225163 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 12:32:11 +00:00
Chandler Carruth
040ca449b2 [PM] Add names and debug logging for analysis passes to the new pass
manager.

This starts to allow us to test analyses more easily, but it's really
only the beginning. Some of the code here is still untestable without
manual changes to create analysis passes, but I wanted to factor it into
a small of chunks as possible.

Next up in order to be able to test things are, in no particular order:
- No-op analyses passes so we don't have to use real ones to exercise
  the pass maneger itself.
- Automatic way of generating dummy passes that require an analysis be
  run, including a variant that calls a 'print' method on a pass to make
  it even easier to print out the results of an analysis.
- Dummy passes that invalidate all analyses for their IR unit so we can
  test invalidation and re-runs.
- Automatic way to print each analysis pass as it is re-run.
- Automatic but optional verification of analysis passes everywhere
  possible.

I'm not claiming I'll get to all of these immediately, but that's what
is in the pipeline at some stage. I'm fleshing out exactly what I need
and what to prioritize by working on converting analyses and then trying
to test the conversion. =]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225162 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 12:21:44 +00:00
Jiangning Liu
614fe873ce Fixed a bug in memory dependence checking module of loop vectorization. The following loop should not be vectorized with current algorithm.
{code}
// loop body
   ... = a[i]          (1)
    ... = a[i+1]       (2)
 .......
a[i+1] = ....          (3)
   a[i] = ...          (4)
{code}

The algorithm tries to collect memory access candidates from AliasSetTracker, and then check memory dependences one another. The memory accesses are unique in AliasSetTracker, and a single memory access in AliasSetTracker may map to multiple entries in AccessAnalysis, which could cover both 'read' and 'write'. Originally the algorithm only checked 'write' entry in Accesses if only 'write' exists. This is incorrect and the consequence is it ignored all read access, and finally some RAW and WAR dependence are missed.

For the case given above, if we ignore two reads, the dependence between (1) and (3) would not be able to be captured, and finally this loop will be incorrectly vectorized.

The fix simply inserts a new loop to find all entries in Accesses. Since it will skip most of all other memory accesses by checking the Value pointer at the very beginning of the loop, it should not increase compile-time visibly.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225159 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 10:08:58 +00:00
Hal Finkel
0ef8f3189e [PowerPC] Enable speculation of cttz/ctlz
PPC has an instruction for ctlz with defined zero behavior, and our lowering of
cttz (provided by DAGCombine) is also efficient and branchless, so speculating
these makes sense.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225150 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 05:24:42 +00:00
Chandler Carruth
4f9a7277d1 [SROA] Apply a somewhat heavy and unpleasant hammer to fix PR22093, an
assert out of the new pre-splitting in SROA.

This fix makes the code do what was originally intended -- when we have
a store of a load both dealing in the same alloca, we force them to both
be pre-split with identical offsets. This is really quite hard to do
because we can keep discovering problems as we go along. We have to
track every load over the current alloca which for any resaon becomes
invalid for pre-splitting, and go back to remove all stores of those
loads. I've included a couple of test cases derived from PR22093 that
cover the different ways this can happen. While that PR only really
triggered the first of these two, its the same fundamental issue.

The other challenge here is documented in a FIXME now. We end up being
quite a bit more aggressive for pre-splitting when loads and stores
don't refer to the same alloca. This aggressiveness comes at the cost of
introducing potentially redundant loads. It isn't clear that this is the
right balance. It might be considerably better to require that we only
do pre-splitting when we can presplit every load and store involved in
the entire operation. That would give more consistent if conservative
results. Unfortunately, it requires a non-trivial change to the actual
pre-splitting operation in order to correctly handle cases where we end
up pre-splitting stores out-of-order. And it isn't 100% clear that this
is the right direction, although I'm starting to suspect that it is.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225149 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 04:17:53 +00:00
Hal Finkel
9cad6c8a24 [PowerPC] Materialize i64 constants using rotation with masking
r225135 added the ability to materialize i64 constants using rotations in order
to reduce the instruction count. Sometimes we can use a rotation only with some
extra masking, so that we take advantage of the fact that generating a bunch of
extra higher-order 1 bits is easy using li/lis.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225147 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 03:41:38 +00:00
Chandler Carruth
51fa09d980 [PM] Wire up support for explicitly running the verifier pass.
The required functionality has been there for some time, but I never
managed to actually wire it into the command line registry of passes.
Let's do that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225144 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-05 00:08:53 +00:00
Simon Pilgrim
c0c36083da [X86][SSE] Added vector packing test for pr12412
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225138 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-04 19:08:03 +00:00
Simon Pilgrim
dc18ec0e0d [X86][SSE] Added vector integer truncation tests - based off pr15524
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225137 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-04 17:52:00 +00:00
Hal Finkel
2ac0826af3 [PowerPC] Materialize i64 constants using rotation
Materializing full 64-bit constants on PPC64 can be expensive, requiring up to
5 instructions depending on the locations of the non-zero bits. Sometimes
materializing a rotated constant, and then applying the inverse rotation, requires
fewer instructions than the direct method. If so, do that instead.

In r225132, I added support for forming constants using bit inversion. In
effect, this reverts that commit and replaces it with rotation support. The bit
inversion is useful for turning constants that are mostly ones into ones that
are mostly zeros (thus enabling a more-efficient shift-based materialization),
but the same effect can be obtained by using negative constants and a rotate,
and that is at least as efficient, if not more.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225135 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-04 15:43:55 +00:00
Hal Finkel
d138a7bb3f [PowerPC] Materialize i64 constants using bit inversion
Materializing full 64-bit constants on PPC64 can be expensive, requiring up to
5 instructions depending on the locations of the non-zero bits. Sometimes
materializing the bit-reversed constant, and then flipping the bits, requires
fewer instructions than the direct method. If so, do that instead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225132 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-04 12:35:03 +00:00
David Majnemer
07d7dbae9e InstCombine: match can find ConstantExprs, don't assume we have a Value
We assumed the output of a match was a Value, this would cause us to
assert because we would fail a cast<>.  Instead, use a helper in the
Operator family to hide the distinction between Value and Constant.

This fixes PR22087.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225127 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-04 07:36:02 +00:00
David Majnemer
77e22b7836 ValueTracking: ComputeNumSignBits should tolerate misshapen phi nodes
PHI nodes can have zero operands in the middle of a transform.  It is
expected that utilities in Analysis don't freak out when this happens.

Note that it is considered invalid to allow these misshapen phi nodes to
make it to another pass.

This fixes PR22086.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225126 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-04 07:06:53 +00:00
Saleem Abdulrasool
b19a485253 llvm-readobj: add support to dump COFF export tables
This enhances llvm-readobj to print out the COFF export table, similar to the
-coff-import option.  This is useful for testing in lld.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225120 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-03 21:35:09 +00:00
Saleem Abdulrasool
97f8f69a7f ARM: permit tail calls to weak externals on COFF
Weak externals are resolved statically, so we can actually generate the tail
call on PE/COFF targets without breaking the requirements.  It is questionable
whether we want to propagate the current behaviour for MachO as the requirements
are part of the ARM ELF specifications, and it seems that prior to the SVN
r215890, we would have tail'ed the call.  For now, be conservative and only
permit it on PE/COFF where the call will always be fully resolved.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225119 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-03 21:35:00 +00:00
Hal Finkel
e05b232c20 [PowerPC/BlockPlacement] Allow target to provide a per-loop alignment preference
The existing code provided for specifying a global loop alignment preference.
However, the preferred loop alignment might depend on the loop itself. For
recent POWER cores, loops between 5 and 8 instructions should have 32-byte
alignment (while the others are better with 16-byte alignment) so that the
entire loop will fit in one i-cache line.

To support this, getPrefLoopAlignment has been made virtual, and can be
provided with an optional MachineLoop* so the target can inspect the loop
before answering the query. The default behavior, as before, is to return the
value set with setPrefLoopAlignment. MachineBlockPlacement now queries the
target for each loop instead of only once per function. There should be no
functional change for other targets.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225117 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-03 17:58:24 +00:00
Hal Finkel
a1d22cc789 [PowerPC] Use 16-byte alignment for modern cores for functions/loops
Most modern PowerPC cores prefer that functions and loops start on
16-byte-aligned boundaries (*), so instruct block placement, etc. to make this
happen. The branch selector has also been adjusted so account for the extra
nops that might now be inserted before loop headers.

(*) Some cores actually prefer other alignments for small loops, but that will
    be addressed in a follow-up commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225115 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-03 14:58:25 +00:00
Hal Finkel
958b670c34 [PowerPC] Add support for the CMPB instruction
Newer POWER cores, and the A2, support the cmpb instruction. This instruction
compares its operands, treating each of the 8 bytes in the GPRs separately,
returning a 'mask' result of 0 (for false) or -1 (for true) in each byte.

Code generation support is added, in the form of a PPCISelDAGToDAG
DAG-preprocessing routine, that recognizes patterns close to what the
instruction computes (either exactly, or related by a constant masking
operation), and generates the cmpb instruction (along with any necessary
constant masking operation). This can be expanded if use cases arise.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225106 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-03 01:16:37 +00:00
Kostya Serebryany
8c6ae1044a [asan] simplify the tracing code, make it use the same guard variables as coverage
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225103 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-03 00:54:43 +00:00
Craig Topper
01c99892ca [X86] Disassembler support for move to/from %rax with a 32-bit memory offset is REX.W and AdSize prefix are both present.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225099 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-03 00:00:20 +00:00
David Majnemer
5e9c6212a8 InstCombine: Detect when llvm.umul.with.overflow always overflows
We know overflow always occurs if both ~LHSKnownZero * ~RHSKnownZero
and LHSKnownOne * RHSKnownOne overflow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225077 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-02 07:29:47 +00:00
Craig Topper
71fc42dbf6 [X86] Make the instructions that use AdSize16/32/64 co-exist together without using mode predicates.
This is necessary to allow the disassembler to be able to handle AdSize32 instructions in 64-bit mode when address size prefix is used.

Eventually we should probably also support 'addr32' and 'addr16' in the assembler to override the address size on some of these instructions. But for now we'll just use special operand types that will lookup the current mode size to select the right instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225075 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-02 07:02:25 +00:00
Chandler Carruth
ce7f347da2 [SROA] Teach SROA to be more aggressive in splitting now that we have
a pre-splitting pass over loads and stores.

Historically, splitting could cause enough problems that I hamstrung the
entire process with a requirement that splittable integer loads and
stores must cover the entire alloca. All smaller loads and stores were
unsplittable to prevent chaos from ensuing. With the new pre-splitting
logic that does load/store pair splitting I introduced in r225061, we
can now very nicely handle arbitrarily splittable loads and stores. In
order to fully benefit from these smarts, we need to mark all of the
integer loads and stores as splittable.

However, we don't actually want to rewrite partitions with all integer
loads and stores marked as splittable. This will fail to extract scalar
integers from aggregates, which is kind of the point of SROA. =] In
order to resolve this, what we really want to do is only do
pre-splitting on the alloca slices with integer loads and stores fully
splittable. This allows us to uncover all non-integer uses of the alloca
that would benefit from a split in an integer load or store (and where
introducing the split is safe because it is just memory transfer from
a load to a store). Once done, we make all the non-whole-alloca integer
loads and stores unsplittable just as they have historically been,
repartition and rewrite.

The result is that when there are integer loads and stores anywhere
within an alloca (such as from a memcpy of a sub-object of a larger
object), we can split them up if there are non-integer components to the
aggregate hiding beneath. I've added the challenging test cases to
demonstrate how this is able to promote to scalars even a case where we
have even *partially* overlapping loads and stores.

This restores the single-store behavior for small arrays of i8s which is
really nice. I've restored both the little endian testing and big endian
testing for these exactly as they were prior to r225061. It also forced
me to be more aggressive in an alignment test to actually defeat SROA.
=] Without the added volatiles there, we actually split up the weird i16
loads and produce nice double allocas with better alignment.

This also uncovered a number of bugs where we failed to handle
splittable load and store slices which didn't have a begininng offset of
zero. Those fixes are included, and without them the existing test cases
explode in glorious fireworks. =]

I've kept support for leaving whole-alloca integer loads and stores as
splittable even for the purpose of rewriting, but I think that's likely
no longer needed. With the new pre-splitting, we might be able to remove
all the splitting support for loads and stores from the rewriter. Not
doing that in this patch to try to isolate any performance regressions
that causes in an easy to find and revert chunk.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225074 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-02 03:55:54 +00:00
Chandler Carruth
40a8741994 [SROA] Add a test case for r225068 / PR22080.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225070 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-02 00:34:29 +00:00
Chandler Carruth
450b39e971 [SROA] Teach SROA how to much more intelligently handle split loads and
stores.

When there are accesses to an entire alloca with an integer
load or store as well as accesses to small pieces of the alloca, SROA
splits up the large integer accesses. In order to do that, it uses bit
math to merge the small accesses into large integers. While this is
effective, it produces insane IR that can cause significant problems in
the rest of the optimizer:

- It can cause load and store mismatches with GVN on the non-alloca side
  where we end up loading an i64 (or some such) rather than loading
  specific elements that are stored.
- We can't always get rid of the integer bit math, which is why we can't
  always fix the loads and stores to work well with GVN.
- This is especially bad when we have operations that mix poorly with
  integer bit math such as floating point operations.
- It will block things like the vectorizer which might be able to handle
  the scalar stores that underly the aggregate.

At the same time, we can't just directly split up these loads and stores
in all cases. If there is actual integer arithmetic involved on the
values, then using integer bit math is actually the perfect lowering
because we can often combine it heavily with the surrounding math.

The solution this patch provides is to find places where SROA is
partitioning aggregates into small elements, and look for splittable
loads and stores that it can split all the way to some other adjacent
load and store. These are uniformly the cases where failing to split the
loads and stores hurts the optimizer that I have seen, and I've looked
extensively at the code produced both from more and less aggressive
approaches to this problem.

However, it is quite tricky to actually do this in SROA. We may have
loads and stores to the same alloca, or other complex patterns that are
hard to handle. This complexity leads to the somewhat subtle algorithm
implemented here. We have to do this entire process as a separate pass
over the partitioning of the alloca, and split up all of the loads prior
to splitting the stores so that we can handle safely the cases of
overlapping, including partially overlapping, loads and stores to the
same alloca. We also have to reconstitute the post-split slice
configuration so we can avoid iterating again over all the alloca uses
(the slow part of SROA). But we also have to ensure that when we split
up loads and stores to *other* allocas, we *do* re-iterate over them in
SROA to adapt to the more refined partitioning now required.

With this, I actually think we can fix a long-standing TODO in SROA
where I avoided splitting as many loads and stores as probably should be
splittable. This limitation historically mitigated the fallout of all
the bad things mentioned above. Now that we have more intelligent
handling, I plan to remove the FIXME and more aggressively mark integer
loads and stores as splittable. I'll do that in a follow-up patch to
help with bisecting any fallout.

The net result of this change should be more fine-grained and accurate
scalars being formed out of aggregates. At the very least, Clang now
generates perfect code for this high-level test case using
std::complex<float>:

  #include <complex>

  void g1(std::complex<float> &x, float a, float b) {
    x += std::complex<float>(a, b);
  }
  void g2(std::complex<float> &x, float a, float b) {
    x -= std::complex<float>(a, b);
  }

  void foo(const std::complex<float> &x, float a, float b,
           std::complex<float> &x1, std::complex<float> &x2) {
    std::complex<float> l1 = x;
    g1(l1, a, b);
    std::complex<float> l2 = x;
    g2(l2, a, b);
    x1 = l1;
    x2 = l2;
  }

This code isn't just hypothetical either. It was reduced out of the hot
inner loops of essentially every part of the Eigen math library when
using std::complex<float>. Those loops would consistently and
pervasively hop between the floating point unit and the integer unit due
to bit math extraction and insertion of floating point values that were
"stored" in a 64-bit integer register around the loop backedge.

So far, this change has passed a bootstrap and I have done some other
testing and so far, no issues. That doesn't mean there won't be though,
so I'll be prepared to help with any fallout. If you performance swings
in particular, please let me know. I'm very curious what all the impact
of this change will be. Stay tuned for the follow-up to also split more
integer loads and stores.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225061 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-01 11:54:38 +00:00
Hal Finkel
84cd524ee9 [PowerPC] Improve instruction selection bit-permuting operations (64-bit)
This is the second installment of improvements to instruction selection for "bit
permutation" instruction sequences. r224318 added logic for instruction
selection for 32-bit bit permutation sequences, and this adds lowering for
64-bit sequences. The 64-bit sequences are more complicated than the 32-bit
ones because:
  a) the 64-bit versions of the 32-bit rotate-and-mask instructions
     work by replicating the lower 32-bits of the value-to-be-rotated into the
     upper 32 bits -- and integrating this into the cost modeling for the various
     bit group operations is non-trivial
  b) unlike the 32-bit instructions in 32-bit mode, the rotate-and-mask instructions
     cannot, in one instruction, specify the
     mask starting index, the mask ending index, and the rotation factor. Also,
     forming arbitrary 64-bit constants is more complicated than in 32-bit mode
     because the number of instructions necessary is value dependent.

Plus, support for 'late masking' was added: it is sometimes more efficient to
treat the overall value as if it had no mandatory zero bits when planning the
bit-group insertions, and then mask them in at the very end. Unfortunately, as
the structure of the bit groups is different in the two cases, the more
feasible implementation technique was to generate both instruction sequences,
and then pick the shorter one.

And finally, we now generate reasonable code for i64 bswap:

        rldicl 5, 3, 16, 0
        rldicl 4, 3, 8, 0
        rldicl 6, 3, 24, 0
        rldimi 4, 5, 8, 48
        rldicl 5, 3, 32, 0
        rldimi 4, 6, 16, 40
        rldicl 6, 3, 48, 0
        rldimi 4, 5, 24, 32
        rldicl 5, 3, 56, 0
        rldimi 4, 6, 40, 16
        rldimi 4, 5, 48, 8
        rldimi 4, 3, 56, 0

vs. what we used to produce:

        li 4, 255
        rldicl 5, 3, 24, 40
        rldicl 6, 3, 40, 24
        rldicl 7, 3, 56, 8
        sldi 8, 3, 8
        sldi 10, 3, 24
        sldi 12, 3, 40
        rldicl 0, 3, 8, 56
        sldi 9, 4, 32
        sldi 11, 4, 40
        sldi 4, 4, 48
        andi. 5, 5, 65280
        andis. 6, 6, 255
        andis. 7, 7, 65280
        sldi 3, 3, 56
        and 8, 8, 9
        and 4, 12, 4
        and 9, 10, 11
        or 6, 7, 6
        or 5, 5, 0
        or 3, 3, 4
        or 7, 9, 8
        or 4, 6, 5
        or 3, 3, 7
        or 3, 3, 4

which is 12 instructions, instead of 25, and seems optimal (at least in terms
of code size).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225056 91177308-0d34-0410-b5e6-96231b3b80d8
2015-01-01 02:53:29 +00:00
Sanjay Patel
28650b8ec2 InstCombine: fsub nsz 0, X ==> fsub nsz -0.0, X
Some day the backend may handle instruction-level fast math flags and make
this transform unnecessary, but it's still better practice to use the canonical
representation of fneg when possible (use a -0.0).

This is a partial fix for PR20870 ( http://llvm.org/bugs/show_bug.cgi?id=20870 ).
See also http://reviews.llvm.org/D6723.

Differential Revision: http://reviews.llvm.org/D6731



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225050 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-31 22:14:05 +00:00
Rafael Espindola
8093abb745 Add r224985 back with a fix.
The issues was that AArch64 has additional restrictions on when local
relocations can be used. We have to take those into consideration when
deciding to put a L symbol in the symbol table or not.

Original message:

Remove doesSectionRequireSymbols.

In an assembly expression like

bar:
.long L0 + 1

the intended semantics is that bar will contain a pointer one byte past L0.

In sections that are merged by content (strings, 4 byte constants, etc), a
single position in the section doesn't give the linker enough information.
For example, it would not be able to tell a relocation must point to the
end of a string, since that would look just like the start of the next.

The solution used in ELF to use relocation with symbols if there is a non-zero
addend.

In MachO before this patch we would just keep all symbols in some sections.

This would miss some cases (only cstrings on x86_64 were implemented) and was
inefficient since most relocations have an addend of 0 and can be represented
without the symbol.

This patch implements the non-zero addend logic for MachO too.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225048 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-31 17:19:34 +00:00
Colin LeMahieu
f5943bd9a9 Reverting 225045 and 225043 and XFAIL multiline.ll on hexagon
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225047 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-31 17:14:35 +00:00
Rafael Espindola
85419e7e3b Add a test for the recent compiler-rt build failure.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225046 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-31 16:58:05 +00:00
Rafael Espindola
937e781f49 Revert "Remove doesSectionRequireSymbols."
This reverts commit r224985.

I am investigating why it made an Apple bot unhappy.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225044 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-31 16:06:48 +00:00
Craig Topper
c602b726a8 [X86] Update disassembler tests for absolute move instructions to check the encodings. This provides testing for r225036. 64-bit mode is still broken.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225037 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-31 07:24:23 +00:00
David Majnemer
0f77ccd6bb InstCombine: try to transform A-B < 0 into A < B
We are allowed to move the 'B' to the right hand side if we an prove
there is no signed overflow and if the comparison itself is signed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225034 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-31 04:21:41 +00:00
Alexey Samsonov
c0319dd9c2 Revert "merge consecutive stores of extracted vector elements"
This reverts commit r224611. This change causes crashes
in X86 DAG->DAG Instruction Selection.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225031 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-31 00:40:28 +00:00
Colin LeMahieu
96c631b191 [Hexagon] Adding accumulating add/sub, doubleword logic-not variants, doubleword bitfield extract, word parity, accumulating multiplies with saturation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225024 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-31 00:08:34 +00:00
David Blaikie
855324b9de Fix a test case to not depend on asm comment syntax, so as to be portable
Too many different comment characters - instead of trying to account for
them all, instead disable the comments and just check for end-of-line
instead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225020 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 23:33:55 +00:00
David Blaikie
6fcdb2681c Generalize even further, for ARM comment syntax (@)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225019 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 23:23:58 +00:00
Colin LeMahieu
cb5c5f5934 [Hexagon] Adding double-logic on predicate instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225018 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 23:22:39 +00:00
David Blaikie
2e05c34ba6 Generalize test case to handle different asm syntax (# or // comments)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225017 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 23:21:57 +00:00
Colin LeMahieu
6026119d9f [Hexagon] Adding newvalue compare and jumps.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225015 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 23:04:21 +00:00
David Blaikie
1d68fc5021 DebugInfo: Omit is_stmt from line table entries on the same line.
GCC does this for non-zero discriminators and since GCC doesn't produce
column info, that was the only place it comes up there. For LLVM, since
we can emit discriminators and/or column info, it makes more sense to
invert the condition and just test for changes in line number.

This should resolve at least some of the GDB 7.5 test suite failures
created by recent Clang changes that increase the location fidelity
(which, since Clang defaults to including column info on Linux by
default created a bunch of cases that confused GDB).

In theory we could do this better/differently by grouping actual source
statements together in a similar manner to the way lexical scopes are
handled but given that GDB isn't really in a position to consume that (&
users are probably somewhat used to different lines being different
'statements') this seems the safest and cheapest change. (I'm concerned
that doing this 'right' would bloat the debugloc data even further -
something Duncan's working hard to address)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225011 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 22:47:13 +00:00
Colin LeMahieu
a7940ef0e4 [Hexagon] Adding postincrement register newvalue stores.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225010 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 22:34:08 +00:00
Colin LeMahieu
df2531486d [Hexagon] Removing old newvalue store variants. Adding postincrement immediate newvalue stores.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225009 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 22:28:31 +00:00
Zoran Jovanovic
25547ee83c [mips][microMIPS] Relocate with symbol for micromips symbols
Differential Revision: http://reviews.llvm.org/D6796


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225008 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 22:04:16 +00:00
Colin LeMahieu
ab63a4c95e [Hexagon] Adding indexed store new-value variants.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225007 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 22:00:26 +00:00
Colin LeMahieu
3fa758981d [Hexagon] Adding indexed store of immediates.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225006 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 21:01:38 +00:00
Colin LeMahieu
65971bbfd7 [Hexagon] Adding indexed stores.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225005 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 20:42:23 +00:00
Peter Collingbourne
d8ae3e1fee x86_64: Fix calls to __morestack under the large code model.
Under the large code model, we cannot assume that __morestack lives within
2^31 bytes of the call site, so we cannot use pc-relative addressing. We
cannot perform the call via a temporary register, as the rax register may
be used to store the static chain, and all other suitable registers may be
either callee-save or used for parameter passing. We cannot use the stack
at this point either because __morestack manipulates the stack directly.

To avoid these issues, perform an indirect call via a read-only memory
location containing the address.

This solution is not perfect, as it assumes that the .rodata section
is laid out within 2^31 bytes of each function body, but this seems to
be sufficient for JIT.

Differential Revision: http://reviews.llvm.org/D6787

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225003 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 20:05:19 +00:00
Kostya Serebryany
dd890d5c5e [asan] change _sanitizer_cov_module_init to accept int* instead of int**
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224999 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 19:29:28 +00:00
Michael Kuperstein
08c26613e1 [COFF] Don't try to add quotes to already quoted linker directives
If a linker directive is already quoted, don't try to quote it again, otherwise it creates a mess.
This pops up in places like:
#pragma comment(linker,"\"/foo bar'\"")

Differential Revision: http://reviews.llvm.org/D6792

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224998 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 19:23:48 +00:00
Colin LeMahieu
88e5659aaf [Hexagon] Adding reg-reg indexed load forms.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224997 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 18:58:47 +00:00
Colin LeMahieu
066f43435a [Hexagon] Adding compare byte/halfword reg-reg/reg-imm forms. Adding compare to general register reg-imm form.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224991 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 17:39:24 +00:00
Colin LeMahieu
af9e1c79a5 [Hexagon] Updating constant extender def, adding alu-not instructions, compare to general register, and inverted compares.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224989 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 15:44:17 +00:00
Rafael Espindola
65300b95e6 Remove doesSectionRequireSymbols.
In an assembly expression like

bar:
.long L0 + 1

the intended semantics is that bar will contain a pointer one byte past L0.

In sections that are merged by content (strings, 4 byte constants, etc), a
single position in the section doesn't give the linker enough information.
For example, it would not be able to tell a relocation must point to the
end of a string, since that would look just like the start of the next.

The solution used in ELF to use relocation with symbols if there is a non-zero
addend.

In MachO before this patch we would just keep all symbols in some sections.

This would miss some cases (only cstrings on x86_64 were implemented) and was
inefficient since most relocations have an addend of 0 and can be represented
without the symbol.

This patch implements the non-zero addend logic for MachO too.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224985 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 13:13:27 +00:00
Rafael Espindola
d5dd993855 Simplify test a bit.
It looks like the original intent was to check which symbols were created.
With macho-dump the sections were being checked just to match which symbol
was in which section.

llvm-objdump prints the section a symbol is in.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224980 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 05:09:17 +00:00
Peter Zotov
21a0fa44e1 [OCaml] Fix bitrot in tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224979 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 03:24:14 +00:00
Peter Zotov
91bf887d6d [lit] Make config.llvm_lib_dir available on cmake, too.
The OCaml tests require config.llvm_lib_dir to determine
the OCaml package search path.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224978 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 03:24:11 +00:00
Craig Topper
f8207ac705 Testcases for r224939.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224976 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 02:35:56 +00:00
Rafael Espindola
dbeada5a92 Convert test to llvm-readobj. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224973 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-30 01:34:06 +00:00
Philip Reames
35f43b8786 Semantic tests for memory invalidation at statepoints
These are simply a collection of tests intended to show that information about the contents of gc references in the heap is lost at a statepoint. I've tried to write them so that they don't disallow correct transformations, while still being fairly easy to understand.

p.s. Ideas for additional tests are welcome.

Differential Revision: http://reviews.llvm.org/D6491



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224971 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-29 23:55:33 +00:00
Philip Reames
91a083c57f Carry facts about nullness and undef across GC relocation
This change implements four basic optimizations:

    If a relocated value isn't used, it doesn't need to be relocated.
    If the value being relocated is null, relocation doesn't change that. (Technically, this might be collector specific. I don't know of one which it doesn't work for though.)
    If the value being relocated is undef, the relocation is meaningless.
    If the value being relocated was known nonnull, the relocated pointer also isn't null. (Since it points to the same source language object.)

I outlined other planned work in comments.

Differential Revision: http://reviews.llvm.org/D6600



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224968 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-29 23:27:30 +00:00
Philip Reames
1714ad67bd Refine the notion of MayThrow in LICM to include a header specific version
In LICM, we have a check for an instruction which is guaranteed to execute and thus can't introduce any new faults if moved to the preheader. To handle a function which might unconditionally throw when first called, we check for any potentially throwing call in the loop and give up.

This is unfortunate when the potentially throwing condition is down a rare path. It prevents essentially all LICM of potentially faulting instructions where the faulting condition is checked outside the loop. It also greatly diminishes the utility of loop unswitching since control dependent instructions - which are now likely in the loops header block - will not be lifted by subsequent LICM runs.

define void @nothrow_header(i64 %x, i64 %y, i1 %cond) {
; CHECK-LABEL: nothrow_header
; CHECK-LABEL: entry
; CHECK: %div = udiv i64 %x, %y
; CHECK-LABEL: loop
; CHECK: call void @use(i64 %div)
entry:
  br label %loop
loop: ; preds = %entry, %for.inc
  %div = udiv i64 %x, %y
  br i1 %cond, label %loop-if, label %exit
loop-if:
  call void @use(i64 %div)
  br label %loop
exit:
  ret void
}

The current patch really only helps with non-memory instructions (i.e. divs, etc..) since the maythrow call down the rare path will be considered to alias an otherwise hoistable load.  The one exception is that it does kick in for loads which are known to be invariant without regard to other possible stores, i.e. those marked with either !invarant.load metadata of tbaa 'is constant memory' metadata.

Differential Revision: http://reviews.llvm.org/D6725



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224965 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-29 23:00:57 +00:00
Philip Reames
456b7b602c Loading from null is valid outside of addrspace 0
This patches fixes a miscompile where we were assuming that loading from null is undefined and thus we could assume it doesn't happen.  This transform is perfectly legal in address space 0, but is not neccessarily legal in other address spaces.

We really should introduce a hook to control this property on a per target per address space basis.  We may be loosing valuable optimizations in some address spaces by being too conservative.

Original patch by Thomas P Raoux (submitted to llvm-commits), tests and formatting fixes by me.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224961 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-29 22:46:21 +00:00
Rafael Espindola
02d187cdb9 Convert test to llvm-readobj. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224959 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-29 22:14:35 +00:00
Colin LeMahieu
7c58cad0ca [Hexagon] Adding allocframe, post-increment circular immediate stores, post-increment circular register stores, and bit reversed post-increment stores.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224957 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-29 21:33:45 +00:00
Colin LeMahieu
0bd2ffae08 [Hexagon] Adding post-increment register form stores and register-immediate form stores with tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224952 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-29 20:44:51 +00:00
Colin LeMahieu
3dc54ee5a4 [Hexagon] Replacing the remaining postincrement stores with versions that have encoding bits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224951 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-29 20:00:43 +00:00
Rafael Espindola
c1c55ba767 Convert test to FileCheck. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224950 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-29 19:50:32 +00:00
Colin LeMahieu
d25cfdb649 [Hexagon] Renaming old multiclass for removal. Adding post-increment store classes and instruction defs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224949 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-29 19:42:14 +00:00
Rafael Espindola
a21d820952 Add segmented stack support for DragonFlyBSD.
Patch by Michael Neumann.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224936 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-29 15:47:28 +00:00
NAKAMURA Takumi
ff95215754 llvm/test/CodeGen/X86/fast-isel-call-bool.ll: Add explicit -mtriple=x86_64-unknown to satisfy x64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224907 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-28 23:37:11 +00:00
Keno Fischer
41bda9f201 [X86][ISel] Fix a regression I introduced in r224884
The else case ResultReg was not checked for validity.
To my surprise, this case was not hit in any of the
existing test cases. This includes a new test cases
that tests this path.

Also drop the `target triple` declaration from the
original test as suggested by H.J. Lu, because
apparently with it the test won't be run on Linux

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224901 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-28 15:20:57 +00:00
Michael Kuperstein
bfa4a373f4 [X86] Add missing memory variants to AVX false dependency breaking
Adds missing memory instruction variants to AVX false dependency breaking handling. (SSE was handled in r224246)

Differential Revision: http://reviews.llvm.org/D6780

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224900 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-28 13:15:05 +00:00
Andrea Di Biagio
70a7cda495 [CodeGenPrepare] Teach when it is profitable to speculate calls to @llvm.cttz/ctlz.
If the control flow is modelling an if-statement where the only instruction in
the 'then' basic block (excluding the terminator) is a call to cttz/ctlz,
CodeGenPrepare can try to speculate the cttz/ctlz call and simplify the control
flow graph.

Example:
\code
entry:
  %cmp = icmp eq i64 %val, 0
  br i1 %cmp, label %end.bb, label %then.bb

then.bb:
  %c = tail call i64 @llvm.cttz.i64(i64 %val, i1 true)
  br label %end.bb

end.bb:
  %cond = phi i64 [ %c, %then.bb ], [ 64, %entry]
\code

In this example, basic block %then.bb is taken if value %val is not zero.
Also, the phi node in %end.bb would propagate the size-of in bits of %val
only if %val is equal to zero.

With this patch, CodeGenPrepare will try to hoist the call to cttz from %then.bb
into basic block %entry only if cttz is cheap to speculate for the target.

Added two new hooks in TargetLowering.h to let targets customize the behavior
(i.e. decide whether it is cheap or not to speculate calls to cttz/ctlz). The
two new methods are 'isCheapToSpeculateCtlz' and 'isCheapToSpeculateCttz'.
By default, both methods return 'false'.
On X86, method 'isCheapToSpeculateCtlz' returns true only if the target has
LZCNT. Method 'isCheapToSpeculateCttz' only returns true if the target has BMI.

Differential Revision: http://reviews.llvm.org/D6728


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224899 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-28 11:07:35 +00:00
Elena Demikhovsky
8499a501e4 Scalarizer for masked load and store intrinsics.
Masked vector intrinsics are a part of common LLVM IR, but they are really supported on AVX2 and AVX-512 targets. I added a code that translates masked intrinsic for all other targets. The masked vector intrinsic is converted to a chain of scalar operations inside conditional basic blocks.

http://reviews.llvm.org/D6436



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224897 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-28 08:54:45 +00:00
David Majnemer
bd64447bf3 PowerPC: CTR shouldn't fire if a TLS call is in the loop
Determining the address of a TLS variable results in a function call in
certain TLS models.  This means that a simple ICmpInst might actually
result in invalidating the CTR register.

In such cases, do not attempt to rely on the CTR register for loop
optimization purposes.

This fixes PR22034.

Differential Revision: http://reviews.llvm.org/D6786

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224890 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-27 19:45:38 +00:00
Keno Fischer
cc80af1b4f [FastIsel][X86] Fix invalid register replacement for bool args
Summary:
Consider the following IR:

  %3 = load i8* undef
  %4 = trunc i8 %3 to i1
  %5 = call %jl_value_t.0* @foo(..., i1 %4, ...)
  ret %jl_value_t.0* %5

Bools (that are the result of direct truncs) are lowered as whatever
the argument to the trunc was and a "and 1", causing the part of the
MBB responsible for this argument to look something like this:

  %vreg8<def,tied1> = AND8ri %vreg7<kill,tied0>, 1, %EFLAGS<imp-def>; GR8:%vreg8,%vreg7

Later, when the load is lowered, it will insert

  %vreg15<def> = MOV8rm %vreg14, 1, %noreg, 0, %noreg; mem:LD1[undef] GR8:%vreg15 GR64:%vreg14

but remember to (at the end of isel) replace vreg7 by vreg15. Now for
the bug. In fast isel lowering, we mistakenly mark vreg8 as the result
of the load instead of the trunc. This adds a fixup to have
vreg8 replaced by whatever the result of the load is as well, so
we end up with

  %vreg15<def,tied1> = AND8ri %vreg15<kill,tied0>, 1, %EFLAGS<imp-def>; GR8:%vreg15

which is an SSA violation and causes problems later down the road.

This fixes PR21557.

Test Plan: Test test case from PR21557 is added to the test suite.

Reviewers: ributzka

Reviewed By: ributzka

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6245

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224884 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-27 13:10:15 +00:00
Rafael Espindola
aceb47b808 Convert test to llvm-readobj. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224872 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-26 22:47:39 +00:00
Colin LeMahieu
17946361cc [Hexagon] Adding auto-incrementing loads with and without byte reversal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224871 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-26 21:09:25 +00:00
Colin LeMahieu
de2cee5556 [Hexagon] Adding locked loads.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224870 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-26 20:42:27 +00:00
Colin LeMahieu
6ff5e4862d [Hexagon] Adding deallocframe and circular addressing loads.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224869 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-26 20:30:58 +00:00
Colin LeMahieu
ffba450190 [Hexagon] Adding remaining post-increment instruction variants. Removing unused classes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224868 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-26 19:31:46 +00:00
Colin LeMahieu
a46bee194d [Hexagon] Adding post-increment unsigned byte loads.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224867 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-26 19:12:11 +00:00
Colin LeMahieu
3c52b7b9f2 [Hexagon] Adding post-increment signed byte loads with tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224866 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-26 18:57:13 +00:00
Rafael Espindola
21b9b20f36 Use llvm-readobj. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224864 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-26 18:22:05 +00:00
Craig Topper
a996db696b [X86] Add the debug registers DR8-DR15 so we can assemble and disassemble references to them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224862 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-26 18:20:05 +00:00
Craig Topper
6eb3e3ce10 [X86] Don't fail disassembly if REX.R/REX.B is used on an MMX register. Similar fix to not fail to disassembler CR9-CR15 references.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224861 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-26 18:19:44 +00:00
Timur Iskhodzhanov
f4076dc995 Band-aid fix for PR22032: don't emit DWARF debug info if AddressSanitizer is enabled on Windows
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224860 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-26 17:00:51 +00:00
Rafael Espindola
3bea6d7604 No need to run llvm-as. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224859 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-26 16:42:47 +00:00
David Majnemer
7627d9c229 InstCombine: Infer nuw for multiplies
A multiply cannot unsigned wrap if there are bitwidth, or more, leading
zero bits between the two operands.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224849 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-26 09:50:35 +00:00
David Majnemer
998ae69abe InstCombe: Infer nsw for multiplies
We already utilize this logic for reducing overflow intrinsics, it makes
sense to reuse it for normal multiplies as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224847 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-26 09:10:14 +00:00
Craig Topper
654a66dbd3 Teach disassembler to handle illegal immediates on (v)cmpps/pd/ss/sd instructions. Instead of rejecting we'll just generate the _alt forms that don't try to alter the mnemonic. While I'm here, merge some common code in the Instruction printers for the condition code replacement and fix the mask on SSE to be 3-bits instead of 4.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224846 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-26 06:36:28 +00:00
Hal Finkel
d7b2788e51 [PowerPC] [FastISel] i1 constants must be zero extended
When materializing constant i1 values, they must be zero extended. We represent
i1 values as [0, 1], not [0, -1], in i32 registers. As it turns out, this code
path was dead for i1 values prior to r216006 (which is why this did not manifest in
miscompiles until recently).

Fixes -O0 self-hosting on PPC64/Linux.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224842 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-25 23:08:25 +00:00
Elena Demikhovsky
b31322328a Masked Load/Store - Changed the order of parameters in intrinsics.
No functional changes.
The documentation is coming.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224829 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-25 07:49:20 +00:00
David Majnemer
e277a13a71 CodeGen: Error on redefinitions instead of asserting
It's possible to have a prior definition of a symbol in module asm.
Raise an error instead of crashing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224828 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-24 23:06:55 +00:00
David Majnemer
d36cad9914 CodeGen: Allow aliases to be overridden by variables
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224827 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-24 22:44:29 +00:00
David Majnemer
e54eacce75 MC: Label definitions are permitted after .set directives
.set directives may be overridden by other .set directives as well as
label definitions.

This fixes PR22019.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224811 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-24 10:27:50 +00:00
Saleem Abdulrasool
3681929e11 IAS: correct debug line info for asm macros
Correct the line information generation for preprocessed assembly.  Although we
tracked the source information for the macro instantiation, we failed to account
for the fact that we were instantiating a macro, which is populated into a new
buffer and that the line information would be relative to the definition rather
than the actual instantiation location.  This could cause the line number
associated with the statement to be very high due to wrapping of the difference
calculated for the preprocessor line information emitted into the stream.
Properly calculate the line for the macro instantiation, referencing the line
where the macro is actually used as GCC/gas do.

The test case uses x86, though the same problem exists on any other target using
the LLVM IAS.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224810 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-24 06:32:43 +00:00
David Majnemer
4714bfa1db MC: Don't emit .no_dead_strip on targets which don't support it
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224808 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-24 04:11:42 +00:00