Commit Graph

72240 Commits

Author SHA1 Message Date
Sanjay Patel
d1a09c47d2 name change: isPow2DivCheap -> isPow2SDivCheap
isPow2DivCheap

That name doesn't specify signed or unsigned.

Lazy as I am, I eventually read the function and variable comments. It turns out that this is strictly about signed div. But I discovered that the comments are wrong:

   srl/add/sra

is not the general sequence for signed integer division by power-of-2. We need one more 'sra':

   sra/srl/add/sra

That's the sequence produced in DAGCombiner. The first 'sra' may be removed when dividing by exactly '2', but that's a special case.

This patch corrects the comments, changes the name of the flag bit, and changes the name of the accessor methods.

No functional change intended.

Differential Revision: http://reviews.llvm.org/D5010


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216237 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 22:31:48 +00:00
Quentin Colombet
bd66db27fd [PeepholeOptimizer] Enable the advanced copy optimization by default.
The advanced copy optimization does not yield any difference on the whole llvm
test-suite + SPECs, either in compile time or runtime (binaries are identical),
but has a big potential when data go back and forth between register files as
demonstrated with test/CodeGen/ARM/adv-copy-opt.ll.

Note: This was measured for both Os and O3 for armv7s, arm64, and x86_64.

<rdar://problem/12702965>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216236 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 22:23:52 +00:00
Philip Reames
9bdd5df2ca Whitespace change to reduce diff in future patch.
Patch 2 of 11 in 'Add a "probe-stack" attribute' review thread

Patch by: john.kare.alsaker@gmail.com



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216235 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 22:19:16 +00:00
Philip Reames
ecad452885 [X86] Split out the logic to select the stack probe function (NFC)
Patch 1 of 11 in 'Add a "probe-stack" attribute' review thread.

Patch by: <john.kare.alsaker@gmail.com>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216233 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 22:15:20 +00:00
Robin Morisset
cf165c36ee Rename AtomicExpandLoadLinked into AtomicExpand
AtomicExpandLoadLinked is currently rather ARM-specific. This patch is the first of
a group that aim at making it more target-independent. See
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075873.html
for details

The command line option is "atomic-expand"

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216231 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 21:50:01 +00:00
Quentin Colombet
4921d1af7d [PeepholeOptimizer] Update the kill flags when extending the live-range of the
source of a copy.

<rdar://problem/12702965>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216229 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 21:34:06 +00:00
Justin Bogner
6d2164ec3d Fix a URL (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216228 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 21:09:24 +00:00
Juergen Ributzka
5d6365c80c [FastISel][AArch64] Use the correct register class to make the MI verifier happy.
This is mostly achieved by providing the correct register class manually,
because getRegClassFor always returns the GPR*AllRegClass for MVT::i32 and
MVT::i64.

Also cleanup the code to use the FastEmitInst_* method whenever possible. This
makes sure that the operands' register class is properly constrained. For all
the remaining cases this adds the missing constrainOperandRegClass calls for
each operand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216225 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 20:57:57 +00:00
David Blaikie
95ca0fb247 Explicitly pass ownership of the MemoryBuffer to AddNewSourceBuffer using std::unique_ptr
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216223 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 20:44:56 +00:00
Tom Stellard
fdbf61d00d R600/SI: Teach moveToVALU how to handle more S_LOAD_* instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216220 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 20:41:00 +00:00
Tom Stellard
5f52739370 R600/SI: Make sure SCRATCH_WAVE_OFFSET is added as Live-In to the function
This fixes a crash in an ocl conformance test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216219 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 20:40:58 +00:00
Tom Stellard
7af96a25fc R600/SI: Remove unused SGPR spilling code
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216218 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 20:40:56 +00:00
Tom Stellard
9b60cb102a R600/SI: Use eliminateFrameIndex() to expand SGPR spill pseudos
This will simplify the SGPR spilling and also allow us to use
MachineFrameInfo for calculating offsets, which should be more
reliable than our custom code.

This fixes a crash in some cases where a register would be spilled
in a branch such that the VGPR defined for spilling did not dominate
all the uses when restoring.

This fixes a crash in an ocl conformance test.  The test requries
register spilling and is too big to include.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216217 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 20:40:54 +00:00
Tom Stellard
a07c0778ca R600/SI: Handle VCC in SIRegisterInfo::getPhysRegSubReg()
This fixes a crash in an ocl conformance test.  The test requries
register spilling and is too big to include.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216216 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 20:40:50 +00:00
Rafael Espindola
7b4eb02b6d Move some logic to populateLTOPassManager.
This will avoid code duplication in the next commit which calls it directly
from the gold plugin.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216211 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 20:03:44 +00:00
Adam Nemet
9db660ecaa [AVX512] Add class to group common template arguments related to vector type
We discussed the issue of generality vs. readability of the AVX512 classes
recently.  I proposed this approach to try to hide and centralize the mappings
we commonly perform based on the vector type.  A new class X86VectorVTInfo
captures these.

The idea is to pass an instance of this class to classes/multiclasses instead
of the corresponding ValueType.  Then the class/multiclass can use its field
for things that derive from the type rather than passing all those as separate
arguments.

I modified avx512_valign to demonstrate this new approach.  As you can see
instead of 7 related template parameters we now have one.  The downside is
that we have to refer to fields for the derived values.  I named the argument
'_' in order to make this as invisible as possible.  Please let me know if you
absolutely hate this.  (Also once we allow local initializations in
multiclasses we can recover the original version by assigning the fields to
local variables.)

Another possible use-case for this class is to directly map things, e.g.:

  RegisterClass KRC = X86VectorVTInfo<32, i16>.KRC

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216209 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 19:50:07 +00:00
Alex Lorenz
93695d2395 Coverage Mapping: add function's hash to coverage function records.
The profile data format was recently updated and the new indexing api
requires the code coverage tool to know the function's hash as well
as the function's name to get the execution counts for a function.

Differential Revision: http://reviews.llvm.org/D4994


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216207 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 19:23:25 +00:00
Rafael Espindola
4658ea9dc7 Respect LibraryInfo in populateLTOPassManager and use it. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216203 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 18:49:52 +00:00
Rafael Espindola
8d1b742f7f Remove dead code. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216201 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 18:11:21 +00:00
Quentin Colombet
ad3c6289b6 [AArch64] Run a peephole pass right after AdvSIMD pass.
The AdvSIMD pass may produce copies that are not coalescer-friendly. The
peephole optimizer knows how to fix that as demonstrated in the test case.

<rdar://problem/12702965>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216200 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 18:10:07 +00:00
Juergen Ributzka
60aadd5d8b [FastISel][AArch64] Factor out ANDWri instruction generation into a helper function. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216199 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 18:02:25 +00:00
Moritz Roth
a6afad8b33 Thumb1 load/store optimizer: Improve code to materialize new base register.
There are two add-immediate instructions in Thumb1: tADDi8 and tADDi3. Only
the latter supports using different source and destination registers, so
whenever we materialize a new base register (at a certain offset) we'd do
so by moving the base register value to the new register and then adding in
place. This patch changes the code to use a single tADDi3 if the offset is
small enough to fit in 3 bits.

Differential Revision: http://reviews.llvm.org/D5006

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216193 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 17:11:03 +00:00
Jonathan Roelofs
4c3be1aa0f Add a thread-model knob for lowering atomics on baremetal & single threaded systems
http://reviews.llvm.org/D4984


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216182 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 14:35:47 +00:00
Rafael Espindola
0b994a70b0 Handle inlining in populateLTOPassManager like in populateModulePassManager.
No functionality change.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216178 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 13:35:30 +00:00
Zinovy Nis
63f5912959 [CLNUP] Remove return after llvm_unreachable. Thanks to Hal Finkel for pointing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216176 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 13:30:05 +00:00
Benjamin Kramer
daada81e5c DAGCombiner: Make concat_vector combine safe for EVTs and concat_vectors with many arguments.
PR20677

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216175 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 13:28:02 +00:00
Rafael Espindola
47199c3d0c Move DisableGVNLoadPRE from populateLTOPassManager to PassManagerBuilder.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216174 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 13:13:17 +00:00
Josh Klontz
f19807db70 X86AsmPrinter MCJIT MSVC bug fix.
Summary:
This bug was introduced in r213006 which makes an assumption that MCSection is COFF for Windows MSVC. This assumption is broken for MCJIT users where ELF is used instead [1]. The fix is to change the MCSection cast to a dyn_cast.

[1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2013-December/068407.html.

Reviewers: majnemer

Reviewed By: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D4872

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216173 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 12:55:27 +00:00
Oliver Stannard
760a46522a [ARM] Enable DP copy, load and store instructions for FPv4-SP
The FPv4-SP floating-point unit is generally referred to as
single-precision only, but it does have double-precision registers and
load, store and GPR<->DPR move instructions which operate on them.
This patch enables the use of these registers, the main advantage of
which is that we now comply with the AAPCS-VFP calling convention.
This partially reverts r209650, which added some AAPCS-VFP support,
but did not handle return values or alignment of double arguments in
registers.

This patch also adds tests for Thumb2 code generation for
floating-point instructions and intrinsics, which previously only
existed for ARM.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216172 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 12:50:31 +00:00
Erik Verbruggen
0edc5e8391 Reassociate x + -0.1234 * y into x - 0.1234 * y
This does not require -ffast-math, and it gives CSE/GVN more options to
eliminate duplicate expressions in, e.g.:

  return ((x + 0.1234 * y) * (x - 0.1234 * y));

Differential Revision: http://reviews.llvm.org/D4904



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216169 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 10:45:30 +00:00
Benjamin Kramer
f11ac42d01 X86: Turn redundant if into an assertion.
While there remove noop casts.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216168 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 10:31:37 +00:00
Robert Khasanov
fec1abaeab [x86] Added _addcarry_ and _subborrow_ intrinsics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216164 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 09:43:43 +00:00
Robert Khasanov
375a68ca15 [x86] SMAP: added HasSMAP attribute for CLAC/STAC, corrected attributes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216163 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 09:34:12 +00:00
Robert Khasanov
10dacc4b52 [x86] Broadwell: ADOX/ADCX. Added _addcarryx_u{32|64} intrinsics to LLVM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216162 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 09:27:00 +00:00
Robert Khasanov
3a34f5e115 [x86] Enable Broadwell target.
Added FeatureSMAP.

Broadwell ISA includes Haswell ISA + ADX + RDSEED + SMAP


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216161 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 09:16:12 +00:00
Zinovy Nis
164cd0161e [INDVARS] Extend using of widening of induction variables for the cases of "sub nsw" and "mul nsw" instructions.
Currently only "add nsw" are widened. This patch eliminates tons of "sext" instructions for 64 bit code (and the corresponding target code) in cases like:

int N = 100;
float **A;

void foo(int x0, int x1)
{
        float * A_cur = &A[0][0];
        float * A_next = &A[1][0];
        for(int x = x0; x < x1; ++x).
        {
          // Currently only [x+N] case is widened. Others 2 cases lead to sext.
          // This patch fixes it, so all 3 cases do not need sext.
          const float div = A_cur[x + N] + A_cur[x - N] + A_cur[x * N];
          A_next[x] = div;
        }
}
...
> clang++ test.cpp -march=core-avx2 -Ofast  -fno-unroll-loops -fno-tree-vectorize -S -o -

Differential Revision: http://reviews.llvm.org/D4695



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216160 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 08:25:45 +00:00
Elena Demikhovsky
6623af7892 IntelJITEventListener updates to fix breaks by recent changes to EngineBuilder and DIContext.
By Arch Robison.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216159 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 07:01:55 +00:00
Craig Topper
431bdfc4c1 Repace SmallPtrSet with SmallPtrSetImpl in function arguments to avoid needing to mention the size.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216158 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 05:55:13 +00:00
David Majnemer
e234f93b3e InstCombine: Fold ((A | B) & C1) ^ (B & C2) -> (A & C1) ^ B if C1^C2=-1
Adapted from a patch by Richard Smith, test-case written by me.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216157 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 05:14:48 +00:00
Craig Topper
e88b796285 Remove custom implementations of max/min in StringRef that was originally added to work an old gcc bug. I believe its been fixed by now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216156 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 04:31:10 +00:00
Jiangning Liu
82f1a8cc09 Fix a bug around truncating vector in const prop.
In constant folding stage, "TRUNC" can't handle vector data type.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216149 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 02:12:35 +00:00
Jiangning Liu
150cef218f Revert r216066, "Optimize ZERO_EXTEND and SIGN_EXTEND in both SelectionDAG Builder and type".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216147 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 01:59:30 +00:00
Quentin Colombet
e817bdd304 [PeepholeOptimizer] Take advantage of the isInsertSubreg property in the
advanced copy optimization.

This is the final step patch toward transforming:
udiv    r0, r0, r2
udiv    r1, r1, r3
vmov.32 d16[0], r0
vmov.32 d16[1], r1
vmov    r0, r1, d16
bx      lr

into:
udiv    r0, r0, r2
udiv    r1, r1, r3
bx      lr

Indeed, thanks to this patch, this optimization is able to look through
vmov.32 d16[0], r0
vmov.32 d16[1], r1

and is able to rewrite the following sequence:
vmov.32 d16[0], r0
vmov.32 d16[1], r1
vmov    r0, r1, d16

into simple generic GPR copies that the coalescer managed to remove.

<rdar://problem/12702965>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216144 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 00:19:16 +00:00
Quentin Colombet
7599acc2af [ARM] Mark VSETLNi32 with the InsertSubreg property and implement the related
target hook.

This patch teaches the compiler that:
dX = VSETLNi32 dY, rZ, imm
is the same as:
dX = INSERT_SUBREG dY, rZ, translateImmToSubIdx(imm)

<rdar://problem/12702965>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216143 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 00:10:52 +00:00
James Molloy
824431f680 [LoopVectorize] Up the maximum unroll factor to 4 for AArch64
Only for Cortex-A57 and Cyclone for now, where it has shown wins.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216141 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 00:02:51 +00:00
James Molloy
bb819edc1f [LoopVectorizer] Limit unroll factor in the presence of nested reductions.
If we have a scalar reduction, we can increase the critical path length if the loop we're unrolling is inside another loop. Limit, by default to 2, so the critical path only gets increased by one reduction operation.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216140 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 23:53:52 +00:00
Quentin Colombet
0d15213307 Add isInsertSubreg property.
This patch adds a new property: isInsertSubreg and the related target hooks:
TargetIntrInfo::getInsertSubregInputs and
TargetInstrInfo::getInsertSubregLikeInputs to specify that a target specific
instruction is a (kind of) INSERT_SUBREG.

The approach is similar to r215394.

<rdar://problem/12702965>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216139 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 23:49:36 +00:00
Jonathan Roelofs
506ed4d4a5 Lower thumbv4t & thumbv5 lo->lo copies through a push-pop sequence
On pre-v6 hardware, 'MOV lo, lo' gives undefined results, so such copies need to
be avoided. This patch trades simplicity for implementation time at the expense
of performance... As they say: correctness first, then performance.

See http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075998.html for a few
ideas on how to make this better.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216138 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 23:38:50 +00:00
Quentin Colombet
8ccce6b8fe [PeepholeOptimizer] Take advantage of the isExtractSubreg property in the
advanced copy optimization.

This patch is a step toward transforming:
udiv	r0, r0, r2
udiv	r1, r1, r3
vmov.32	d16[0], r0
vmov.32	d16[1], r1
vmov	r0, r1, d16
bx	lr

into:
udiv	r0, r0, r2
udiv	r1, r1, r3
bx	lr

Indeed, thanks to this patch, this optimization is able to look through
vmov r0, r1, d16
but it does not understand yet
vmov.32 d16[0], r0
vmov.32 d16[1], r1

Comming patches will fix that and update the related test case.

<rdar://problem/12702965>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216136 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 23:13:02 +00:00
Yi Jiang
ee1b45f2a2 New InstCombine pattern: (icmp ult/ule (A + C1), C3) | (icmp ult/ule (A + C2), C3) to (icmp ult/ule ((A & ~(C1 ^ C2)) + max(C1, C2)), C3) under certain condition
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216135 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-20 22:55:40 +00:00