Commit Graph

8011 Commits

Author SHA1 Message Date
Jakob Stoklund Olesen
2b9355f2d9 SPARC v9 stack pointer bias.
64-bit SPARC v9 processes use biased stack and frame pointers, so the
current function's stack frame is located at %sp+BIAS .. %fp+BIAS where
BIAS = 2047.

This makes more local variables directly accessible via [%fp+simm13]
addressing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178965 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-06 21:38:57 +00:00
Hal Finkel
839b909653 Implement PPCInstrInfo::FoldImmediate
There are certain PPC instructions into which we can fold a zero immediate
operand. We can detect such cases by looking at the register class required
by the using operand (so long as it is not otherwise constrained).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178961 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-06 19:30:30 +00:00
Jakob Stoklund Olesen
1f25fe5023 Complete formal arguments for the SPARC v9 64-bit ABI.
All arguments are formally assigned to stack positions and then promoted
to floating point and integer registers. Since there are more floating
point registers than integer registers, this can cause situations where
floating point arguments are assigned to registers after integer
arguments that where assigned to the stack.

Use the inreg flag to indicate 32-bit fragments of structs containing
both float and int members.

The three-way shadowing between stack, integer, and floating point
registers requires custom argument lowering. The good news is that
return values are passed in the exact same way, and we can share the
code.

Still missing:

 - Update LowerReturn to handle structs returned in registers.
 - LowerCall.
 - Variadic functions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178958 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-06 18:32:12 +00:00
Tom Stellard
17ea10cb79 R600/SI: Add support for buffer stores v2
v2:
  - Use the ADDR64 bit

Reviewed-by: Christian König <christian.koenig@amd.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178931 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 23:31:51 +00:00
Tom Stellard
2a4d3e7e87 R600/SI: Add processor types for each SI variant
Reviewed-by: Christian König <christian.koenig@amd.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178928 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 23:31:35 +00:00
Tom Stellard
2fc7443498 R600/SI: Avoid generating S_MOVs with 64-bit immediates v2
SITargetLowering::analyzeImmediate() was converting the 64-bit values
to 32-bit and then checking if they were an inline immediate.  Some
of these conversions caused this check to succeed and produced
S_MOV instructions with 64-bit immediates, which are illegal.

v2:
  - Clean up logic

Reviewed-by: Christian König <christian.koenig@amd.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178927 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 23:31:20 +00:00
Hal Finkel
ff56d1a201 Enable early if conversion on PPC
On cores for which we know the misprediction penalty, and we have
the isel instruction, we can profitably perform early if conversion.
This enables us to replace some small branch sequences with selects
and avoid the potential stalls from mispredicting the branches.

Enabling this feature required implementing canInsertSelect and
insertSelect in PPCInstrInfo; isel code in PPCISelLowering was
refactored to use these functions as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178926 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 23:29:01 +00:00
Timur Iskhodzhanov
f340d34a97 Make the test/CodeGen/X86/win32_sret.ll reliable on any CPU by explicitly specifying the -mcpu
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178885 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 17:05:56 +00:00
Renato Golin
84581daf20 Reverting 178851 as it broke buildbots
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178883 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 16:39:53 +00:00
Stepan Dyatkovskiy
89becbb974 Fix for PR14824: "Optimization arm_ldst_opt inserts newly generated instruction vldmia at incorrect position".
Patch introduces memory operands tracking in ARMLoadStoreOpt::LoadStoreMultipleOpti. For each register it keeps the order of load operations as it was before optimization pass.
It is kind of deep improvement of fix proposed by Hao: http://llvm.org/bugs/show_bug.cgi?id=14824#c4
But it also tracks conflicts between different register classes (e.g. D2 and S5).
For more details see:
Bug description: http://llvm.org/bugs/show_bug.cgi?id=14824
LLVM Commits discussion: 
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130311/167936.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130318/168688.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130325/169376.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130401/170238.html



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178851 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 05:52:14 +00:00
Andrew Trick
614dacc910 RegisterPressure heuristics currently require signed comparisons.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178823 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-05 00:31:34 +00:00
Hal Finkel
7530a9f7d1 PPC: Improve code generation for mixed-precision reciprocal sqrt
The DAGCombine logic that recognized a/sqrt(b) and transformed it into
a multiplication by the reciprocal sqrt did not handle cases where the
sqrt and the division were separated by an fpext or fptrunc.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178801 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-04 22:44:12 +00:00
Jakob Stoklund Olesen
ee27cac9fa Avoid high-latency false CPSR dependencies even for tMOVSi.
The Thumb2SizeReduction pass avoids false CPSR dependencies, except it
still aggressively creates tMOVi8 instructions because they are so
common.

Avoid creating false CPSR dependencies even for tMOVi8 instructions when
the the CPSR flags are known to have high latency. This allows integer
computation to overlap floating point computations.

Also process blocks in a reverse post-order and propagate high-latency
flags to successors.

<rdar://problem/13468102>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178773 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-04 18:25:36 +00:00
Stepan Dyatkovskiy
ed89568948 New-password-test commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178765 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-04 16:11:18 +00:00
Vincent Lejeune
39cd6fae34 R600: Take export into account when computing cf address
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178761 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-04 13:59:59 +00:00
Jakob Stoklund Olesen
0e16488442 Add SPARC v9 support for select on 64-bit compares.
This requires v9 cmov instructions using the %xcc flags instead of the
%icc flags.

Still missing:
- Select floats on %xcc flags.
- Select i64 on %fcc flags.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178737 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-04 03:08:00 +00:00
Vincent Lejeune
5417223f98 R600: Fix last ALU of a clause being emitted in a separate clause
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178675 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-03 18:24:47 +00:00
Bill Schmidt
cd7a1558ed Fix PR15632: No support for ppcf128 floating-point remainder on PowerPC.
For this we need to use a libcall.  Previously LLVM didn't implement
libcall support for frem, so I've added it in the usual
straightforward manner.  A test case from the bug report is included.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178639 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-03 13:05:44 +00:00
Timur Iskhodzhanov
e79c17856e Temporarily relax the WIN32 checks in the SRet test to fix the Atom D2700 bot
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178635 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-03 12:17:15 +00:00
Timur Iskhodzhanov
eea35066ab Fix SRet for thiscall in i686-pc-win32
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178634 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-03 11:27:54 +00:00
Jakob Stoklund Olesen
8534e9998c Add 64-bit compare + branch for SPARC v9.
The same compare instruction is used for 32-bit and 64-bit compares. It
sets two different sets of flags: icc and xcc.

This patch adds a conditional branch instruction using the xcc flags for
64-bit compares.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178621 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-03 04:41:44 +00:00
Hal Finkel
827307b95f Use PPC reciprocal estimates with Newton iteration in fast-math mode
When unsafe FP math operations are enabled, we can use the fre[s] and
frsqrte[s] instructions, which generate reciprocal (sqrt) estimates, together
with some Newton iteration, in order to quickly generate floating-point
division and sqrt results. All of these instructions are separately optional,
and so each has its own feature flag (except for the Altivec instructions,
which are covered under the existing Altivec flag). Doing this is not only
faster than using the IEEE-compliant fdiv/fsqrt instructions, but allows these
computations to be pipelined with other computations in order to hide their
overall latency.

I've also added a couple of missing fnmsub patterns which turned out to be
missing (but are necessary for good code generation of the Newton iterations).
Altivec needs a similar fix, but that will probably be more complicated because
fneg is expanded for Altivec's v4f32.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178617 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-03 04:01:11 +00:00
Akira Hatanaka
67fdafe1cd [mips] Small update to the implementation of eh.return for Mips.
This patch initializes t9 to the handler address, but only if the relocation
model is pic. This handles the case where handler to which eh.return jumps 
points to the start of the function.

Patch by Sasa Stankovic.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178588 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-02 23:02:07 +00:00
NAKAMURA Takumi
17020de0e9 llvm/test/CodeGen/X86: Unmark them out of XFAIL:cygming, in atomic{32|64}.ll and handle-move.ll, corresponding to r178549.
This reverts r176808, r176798, and r177914.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178583 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-02 22:35:08 +00:00
Bill Schmidt
debf7d345a Fix PR15630: Replace faulty stdcx. with stwcx.
When doing a partword atomic operation, a lwarx was being paired with
a stdcx. instead of a stwcx. when compiling for a 64-bit target.  The
target has nothing to do with it in this case; we always need a stwcx.

Thanks to Kai Nacke for reporting the problem.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178559 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-02 18:37:08 +00:00
Jakob Stoklund Olesen
423d674412 Don't attempt MTM heuristics without a scheduling model present.
This should fix the PPC buildbots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178558 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-02 18:26:45 +00:00
Chad Rosier
146b8c2129 [fast-isel] Use the correct API to disable FastLowerArguments for Win64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178549 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-02 16:31:41 +00:00
Arnold Schwaighofer
e737018a86 DAGCombiner: Merge store/loads when we have extload/truncstores
This is helps on architectures where i8,i16 are not legal but we have byte, and
short loads/stores. Allowing us to merge copies like the one below on ARM.

copy(char *a, char *b, int n) {
 do {
   int t0 = a[0];
   int t1 = a[1];
   b[0] = t0;
   b[1] = t1;

radar://13536387

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178546 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-02 15:58:51 +00:00
Preston Gurd
e97f84e991 Simplify test cases for Atom preferring call register indirect over
call memory indirect (32 and 64 bit).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178541 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-02 14:25:06 +00:00
Jakob Stoklund Olesen
61ed5ddefe Add 64-bit load and store instructions.
There is only a few new instructions, the rest is handled with patterns.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178528 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-02 04:09:28 +00:00
Jakob Stoklund Olesen
73c5f80ec9 Basic 64-bit ALU operations.
SPARC v9 extends all ALU instructions to 64 bits, so we simply need to
add patterns to use them for both i32 and i64 values.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178527 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-02 04:09:23 +00:00
Jakob Stoklund Olesen
39e75544dc Materialize 64-bit immediates.
The last resort pattern produces 6 instructions, and there are still
opportunities for materializing some immediates in fewer instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178526 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-02 04:09:17 +00:00
Jakob Stoklund Olesen
c3ff3f42ee Add 64-bit shift instructions.
SPARC v9 defines new 64-bit shift instructions. The 32-bit shift right
instructions are still usable as zero and sign extensions.

This adds new F3_Sr and F3_Si instruction formats that probably should
be used for the 32-bit shifts as well. They don't really encode an
simm13 field.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178525 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-02 04:09:12 +00:00
Jakob Stoklund Olesen
f37812e906 Add support for 64-bit calling convention.
This is far from complete, but it is enough to make it possible to write
test cases using i64 arguments.

Missing features:
- Floating point arguments.
- Receiving arguments on the stack.
- Calls.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178523 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-02 04:09:02 +00:00
Vincent Lejeune
08001a5a15 R600: Add support for native control flow
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178505 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-01 21:48:05 +00:00
Vincent Lejeune
8e59191eb8 R600: Emit CF_ALU and use true kcache register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178503 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-01 21:47:42 +00:00
Hal Finkel
a1646ceb9a Fix a bad assert in PPCTargetLowering
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178489 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-01 18:42:58 +00:00
Hal Finkel
6c81b118ca Add triple to test/CodeGen/PowerPC/stfiwx-2
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178486 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-01 18:18:44 +00:00
Arnold Schwaighofer
f28a29b776 Merge load/store sequences with adresses: base + index + offset
We would also like to merge sequences that involve a variable index like in the
example below.

    int index = *idx++
    int i0 = c[index+0];
    int i1 = c[index+1];
    b[0] = i0;
    b[1] = i1;

By extending the parsing of the base pointer to handle dags that contain a
base, index, and offset we can handle examples like the one above.

The dag for the code above will look something like:

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (i8 load %index))))

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (i32 add (i32 signextend (i8 load %index))
                                         (i32 1)))))

The code that parses the tree ignores the intermediate sign extensions. However,
if there is a sign extension it needs to be on all indexes.

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (add (i8 load %index)
                                     (i8 1))))
 vs

 (load (i64 add (i64 copyfromreg %c)
                (i64 signextend (i32 add (i32 signextend (i8 load %index))
                                         (i32 1)))))
radar://13536387

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178483 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-01 18:12:58 +00:00
Hal Finkel
4647919784 Add more PPC floating-point conversion instructions
The P7 and A2 have additional floating-point conversion instructions which
allow a direct two-instruction sequence (plus load/store) to convert from all
combinations (signed/unsigned i32/i64) <--> (float/double) (on previous cores,
only some combinations were directly available).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178480 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-01 17:52:07 +00:00
Hal Finkel
dc8efbae14 Fix PowerPC/cttz.ll to specify a cpu (and use FileCheck)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178472 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-01 16:31:56 +00:00
Hal Finkel
1fce88313e Add the PPC popcntw instruction
The popcntw instruction is available whenever the popcntd instruction is
available, and performs a separate popcnt on the lower and upper 32-bits.
Ignoring the high-order count, this can be used for the 32-bit input case
(saving on the explicit zero extension otherwise required to use popcntd).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178470 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-01 15:58:15 +00:00
Benjamin Kramer
b8f0d89d05 X86: Promote sitofp <8 x i16> to <8 x i32> when AVX is available.
A vector sext + sitofp is a lot cheaper than 8 scalar conversions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178448 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-31 12:49:15 +00:00
Hal Finkel
8049ab15e4 Add the PPC lfiwax instruction
This instruction is available on modern PPC64 CPUs, and is now used
to improve the SINT_TO_FP lowering (by eliminating the need for the
separate sign extension instruction and decreasing the amount of
needed stack space).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178446 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-31 10:12:51 +00:00
Hal Finkel
9ad0f4907b Cleanup PPC(64) i32 -> float/double conversion
The existing SINT_TO_FP code for i32 -> float/double conversion was disabled
because it relied on broken EXTSW_32/STD_32 instruction definitions. The
original intent had been to enable these 64-bit instructions to be used on CPUs
that support them even in 32-bit mode.  Unfortunately, this form of lying to
the infrastructure was buggy (as explained in the FIXME comment) and had
therefore been disabled.

This re-enables this functionality, using regular DAG nodes, but only when
compiling in 64-bit mode. The old STD_32/EXTSW_32 definitions (which were dead)
are removed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178438 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-31 01:58:02 +00:00
Benjamin Kramer
0b68b758bb DAGCombine: visitXOR can replace a node without returning it, bail out in that case.
Fixes the crash reported in PR15608.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178429 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-30 21:28:18 +00:00
Benjamin Kramer
42734cfb41 Change '@SECREL' suffix to GAS-compatible '@SECREL32'.
'@SECREL' is what is used by the Microsoft assembler, but GNU as expects '@SECREL32'.
With the patch, the MC-generated code works fine in combination with a recent GNU as (2.23.51.20120920 here).

Patch by David Nadlinger!
Differential Revision: http://llvm-reviews.chandlerc.com/D429

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178427 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-30 16:21:50 +00:00
Justin Holewinski
b24fc1c7f7 [NVPTX] Remove support for SM < 2.0. This was never fully supported anyway.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178417 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-30 14:29:30 +00:00
Justin Holewinski
21fdcb0271 [NVPTX] Add NVVMReflect pass to allow compile-time selection of
specific code paths.

This allows us to write code like:

  if (__nvvm_reflect("FOO"))
    // Do something
  else
    // Do something else

and compile into a library, then give "FOO" a value at kernel
compile-time so the check becomes a no-op.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178416 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-30 14:29:25 +00:00
Akira Hatanaka
fd2cd0db97 [mips] Add patterns for DSP indexed load instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178408 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-30 02:14:45 +00:00