Commit Graph

81 Commits

Author SHA1 Message Date
Pekka Jaaskelainen
d72b4d321e Made the min-trip-count-switch test X86-specific to avoid
breakage with builds without X86-support.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174052 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-31 10:33:22 +00:00
Renato Golin
0261cea689 Adding simple cast cost to ARM
Changing ARMBaseTargetMachine to return ARMTargetLowering intead of
the generic one (similar to x86 code).

Tests showing which instructions were added to cast when necessary
or cost zero when not. Downcast to 16 bits are not lowered in NEON,
so costs are not there yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173849 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-29 23:31:38 +00:00
Pekka Jaaskelainen
d855049576 LoopVectorize: convert TinyTripCountVectorThreshold constant
to a command line switch.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173837 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-29 21:42:08 +00:00
Nadav Rotem
f148c66ce4 Add support for reverse pointer induction variables. These are loops that contain pointers that count backwards.
For example, this is the hot loop in BZIP:

  do {
    m = *--p;
    *p = ( ... );
  } while (--n);



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173219 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-23 01:35:00 +00:00
Nadav Rotem
0bbbc52dc8 LoopVectorizer: Implement a new heuristics for selecting the unroll factor.
We ignore the cpu frontend and focus on pipeline utilization. We do this because we
don't have a good way to estimate the loop body size at the IR level.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172964 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-20 05:24:29 +00:00
Nadav Rotem
bcdabadaf4 Change the cpu type in the test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172963 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-20 05:20:56 +00:00
Benjamin Kramer
1af132dcf3 LoopVectorizer: Emit memory checks into their own basic block.
This separates the check for "too few elements to run the vector loop" from the
"memory overlap" check, giving a lot nicer code and allowing to skip the memory
checks when we're not going to execute the vector code anyways. We still leave
the decision of whether to emit the memory checks as branches or setccs, but it
seems to be doing a good job. If ugly code pops up we may want to emit them as
separate blocks too. Small speedup on MultiSource/Benchmarks/MallocBench/espresso.

Most of this is legwork to allow multiple bypass blocks while updating PHIs,
dominators and loop info.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172902 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-19 13:57:58 +00:00
Benjamin Kramer
c759dd5f83 Move test that depends on the x86 target into a target-specific directory.
Should fix the arm buildbot (which only builds the arm target).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172611 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-16 13:25:56 +00:00
Nadav Rotem
b6db95f42b Fix PR14547. Handle induction variables of small sizes smaller than i32 (i8 and i16).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172348 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-13 07:56:29 +00:00
Nadav Rotem
3e40d927a7 ARM Cost Model: Modify the target independent cost model to ask
the target if it supports the different CAST types. We didn't do this
on X86 because of the different register sizes and types, but on ARM
this makes sense.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172245 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-11 19:54:13 +00:00
Nadav Rotem
a675c74208 ARM Cost Model: We need to detect the max bitwidth of types in the loop in order to select the max vectorization factor.
We don't have a detailed analysis on which values are vectorized and which stay scalars in the vectorized loop so we use
another method. We look at reduction variables, loads and stores, which are the only ways to get information in and out
of loop iterations. If the data types are extended and truncated then the cost model will catch the cost of the vector
zext/sext/trunc operations.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172178 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-11 07:11:59 +00:00
Nadav Rotem
c560bf638b LoopVectorizer: Fix a bug in the vectorization of BinaryOperators. The BinaryOperator can be folded to an Undef, and we don't want to set NSW flags to undef vals.
PR14878



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172079 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-10 17:34:39 +00:00
Nadav Rotem
14925e6b88 ARM Cost model: Use the size of vector registers and widest vectorizable instruction to determine the max vectorization factor.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172010 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-09 22:29:00 +00:00
Nadav Rotem
df8c22a104 ARM Cost Model: Add a basic vectorization unrolling test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171931 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-09 01:29:07 +00:00
Nadav Rotem
3c90b3d5fd Remove the -licm pass from the loop vectorizer test because the loop vectorizer does it now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171930 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-09 01:20:59 +00:00
Nadav Rotem
83be7b0dd3 Cost Model: Move the 'max unroll factor' variable to the TTI and add initial Cost Model support on ARM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171928 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-09 01:15:42 +00:00
Nadav Rotem
111e5fe7e0 LoopVectorizer: Add support for floating point reductions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171812 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-07 23:13:00 +00:00
Nadav Rotem
9a6c6a3736 LoopVectorizer: When we vectorizer and widen loops we process many elements at once. This is a good thing, except for
small loops. On small loops post-loop that handles scalars (and runs slower) can take more time to execute than the
rest of the loop. This patch disables widening of loops with a small static trip count.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171798 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-07 21:54:51 +00:00
Nadav Rotem
024328ea49 Fix a typo. Remove the duplicated test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171584 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-05 01:17:46 +00:00
Nadav Rotem
d5b92c3891 iLoopVectorize: Non commutative operators can be used as reduction variables as long as the reduction chain is used in the LHS.
PR14803.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171583 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-05 01:15:47 +00:00
Nadav Rotem
f7737b5baa Force a fixed unroll count on the target independent tests.
This should fix clang-native-arm-cortex-a9. Thanks Renato.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171582 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-05 00:58:48 +00:00
Paul Redmond
5767d91956 Do not vectorize loops with subtraction reductions
Since subtraction does not commute the loop vectorizer incorrectly vectorizes
reductions such as x = A[i] - x.

Disabling for now.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171537 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-04 22:10:16 +00:00
Nadav Rotem
e503319874 LoopVectorizer:
1. Add code to estimate register pressure.
2. Add code to select the unroll factor based on register pressure.
3. Add bits to TargetTransformInfo to provide the number of registers.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171469 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-04 17:48:25 +00:00
Nadav Rotem
12b01e7b31 LoopVectorizer: Test the unrolling flag.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171446 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-03 01:47:31 +00:00
Nadav Rotem
00a6bcaeb4 Avoid vectorization when the function has the "noimplicitflot" attribute.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171429 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-02 23:54:43 +00:00
Nadav Rotem
db2367512e LoopVectorizer: Fix a bug in the code that updates the loop exiting block.
LCSSA PHIs may have undef values. The vectorizer updates values that are used by outside users such as PHIs.
The bug happened because undefs are not loop values. This patch handles these PHIs.

PR14725



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171251 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-30 07:47:00 +00:00
Nadav Rotem
5dd839430c If all of the write objects are identified then we can vectorize the loop even if the read objects are unidentified.
PR14719.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171124 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-26 23:30:53 +00:00
Nadav Rotem
13eb1e7817 LoopVectorizer: Optimize the vectorization of consecutive memory access when the iteration step is -1
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171114 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-26 19:08:17 +00:00
Hal Finkel
1d59f5fa53 LoopVectorize: Enable vectorization of the fmuladd intrinsic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171076 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-25 23:21:29 +00:00
Nick Lewycky
c18f889ee0 Fix typo "Makre" -> "Make".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171043 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-24 19:55:47 +00:00
Nadav Rotem
9e5329d77e LoopVectorizer: When checking for vectorizable types, also check
the StoreInst operands.

PR14705.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171023 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-24 09:14:18 +00:00
Nadav Rotem
470ea9b72f LoopVectorizer: Fix an endless loop in the code that looks for reductions.
The bug was in the code that detects PHIs in if-then-else block sequence.

PR14701.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171008 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-24 01:22:06 +00:00
Nadav Rotem
6f3d81a929 CostModel: Change the default target-independent implementation for finding
the cost of arithmetic functions. We now assume that the cost of arithmetic
operations that are marked as Legal or Promote is low, but ops that are
marked as custom are higher.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171002 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-23 17:31:23 +00:00
Nadav Rotem
d54fed2786 Loop Vectorizer: Update the cost model of scatter/gather operations and make
them more expensive.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170995 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-23 07:23:55 +00:00
Nadav Rotem
55306bdea5 Fix a bug in the code that checks if we can vectorize loops while using dynamic
memory bound checks.  Before the fix we were able to vectorize this loop from
the Livermore Loops benchmark:

for ( k=1 ; k<n ; k++ )
  x[k] = x[k-1] + y[k];



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170811 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-21 00:07:35 +00:00
Nadav Rotem
8386acd734 LoopVectorize: Fix a bug in the scalarization of instructions.
Before if-conversion we could check if a value is loop invariant
if it was declared inside the basic block. Now that loops have
multiple blocks this check is incorrect.

This fixes External/SPEC/CINT95/099_go/099_go



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170756 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-20 20:24:40 +00:00
Benjamin Kramer
35d3462941 Make TargetLowering::getTypeConversion more resilient against odd illegal MVTs.
- An MVT can become an EVT when being split (e.g. v2i8 -> v1i8, the latter doesn't exist)
- Return the scalar value when an MVT is scalarized (v1i64 -> i64)

Fixes PR14639ff.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170546 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-19 14:34:28 +00:00
Benjamin Kramer
0ef0e2e6d0 LoopVectorize: Emit reductions as log2(vectorsize) shuffles + vector ops instead of scalar operations.
For example on x86 with SSE4.2 a <8 x i8> add reduction becomes
	movdqa	%xmm0, %xmm1
	movhlps	%xmm1, %xmm1            ## xmm1 = xmm1[1,1]
	paddw	%xmm0, %xmm1
	pshufd	$1, %xmm1, %xmm0        ## xmm0 = xmm1[1,0,0,0]
	paddw	%xmm1, %xmm0
	phaddw	%xmm0, %xmm0
	pextrb	$0, %xmm0, %edx

instead of
	pextrb	$2, %xmm0, %esi
	pextrb	$0, %xmm0, %edx
	addb	%sil, %dl
	pextrb	$4, %xmm0, %esi
	addb	%dl, %sil
	pextrb	$6, %xmm0, %edx
	addb	%sil, %dl
	pextrb	$8, %xmm0, %esi
	addb	%dl, %sil
	pextrb	$10, %xmm0, %edi
	pextrb	$14, %xmm0, %edx
	addb	%sil, %dil
	pextrb	$12, %xmm0, %esi
	addb	%dil, %sil
	addb	%sil, %dl

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170439 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-18 18:40:20 +00:00
Nadav Rotem
807dad62a0 Teach the cost model about the optimization in r169904: Truncation of induction variables costs the same as scalar trunc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170051 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-13 00:21:03 +00:00
Nadav Rotem
ae3b652f5c LoopVectorizer: Use the "optsize" attribute to decide if we are allowed to increase the function size.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170004 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-12 19:29:45 +00:00
Nadav Rotem
655d2c5354 PR14574. Fix a bug in the code that calculates the mask the converted PHIs in if-conversion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169916 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-11 21:30:14 +00:00
Nadav Rotem
5e9efa10fc Loop Vectorize: optimize the vectorization of trunc(induction_var). The truncation is now done on scalars.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169904 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-11 18:58:10 +00:00
Nadav Rotem
cfb6285fdb Fix PR14565. Don't if-convert loops that have switch statements in them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169813 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-11 04:55:10 +00:00
Nadav Rotem
f0d19bd129 Add support for reverse induction variables. For example:
while (i--)
 sum+=A[i];



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169752 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-10 19:25:06 +00:00
Paul Redmond
880166684e LoopVectorize: support vectorizing intrinsic calls
- added function to VectorTargetTransformInfo to query cost of intrinsics
- vectorize trivially vectorizable intrinsic calls such as sin, cos, log, etc.

Reviewed by: Nadav


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169711 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-09 20:42:17 +00:00
Nadav Rotem
e570dee4b0 Fix a bug in vectorization of if-converted reduction variables. If the
reduction variable is not used outside the loop then we ran into an
endless loop. This change checks if we found the original PHI.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169324 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-04 22:40:22 +00:00
Nadav Rotem
f6088d126e Add support for reduction variables when IF-conversion is enabled.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169288 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-04 18:17:33 +00:00
Nadav Rotem
319d594e22 Add the last part that is needed for vectorization of if-converted code.
Added the code that actually performs the if-conversion during vectorization.

We can now vectorize this code:

for (int i=0; i<n; ++i) {
  unsigned k = 0;

  if (a[i] > b[i])   <------ IF inside the loop.
    k = k * 5 + 3;

  a[i] = k;          <---- K is a phi node that becomes vector-select.
}



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169217 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-04 06:15:11 +00:00
Nadav Rotem
0af63ac245 Add support for pointer induction variables even when there is no integer induction variable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168558 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-25 08:41:35 +00:00
Nadav Rotem
9a6823516f LoopVectorizer: Add initial support for pointer induction variables (for example: *dst++ = *src++).
At the moment we still require to have an integer induction variable (for example: i++).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168231 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-17 00:27:03 +00:00