Commit Graph

9504 Commits

Author SHA1 Message Date
Rafael Espindola
21a9fd247e Fix mingw32 thiscall + sret.
Unlike msvc, when handling a thiscall + sret gcc will
* Put the sret in %ecx
* Put the this pointer is (%esp)

This fixes, for example, calling stringstream::str.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196312 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-03 20:51:23 +00:00
James Molloy
616c94ba87 Addrspacecasts are no-ops on ARM.
Testcase added.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196269 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-03 11:23:11 +00:00
Richard Sandiford
90a34679ef [SystemZ] Fix choice of known-zero mask in insertion optimization
The backend converts 64-bit ORs into subreg moves if the upper 32 bits
of one operand and the low 32 bits of the other are known to be zero.
It then tries to peel away redundant ANDs from the upper 32 bits.

Since AND masks are canonicalized to exclude known-zero bits,
the test ORs the mask and the known-zero bits together before
checking for redundancy.  The problem was that it was using the
wrong node when checking for known-zero bits, so could drop ANDs
that were still needed.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196267 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-03 11:01:54 +00:00
Michael Liao
239ffb30b0 Enhance the fix of PR17631
- The fix to PR17631 fixes part of the cases where 'vzeroupper' should
  not be issued before 'call' insn. There're other cases where helper
  calls will be inserted not limited to epilog. These helper calls do
  not follow the standard calling convention and won't clobber any YMM
  registers. (So far, all call conventions will clobber any or part of
  YMM registers.)
  This patch enhances the previous fix to cover more cases 'vzerosupper' should
  not be inserted by checking if that function call won't clobber any YMM
  registers and skipping it if so.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196261 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-03 09:17:32 +00:00
Hao Liu
1296bb3ba6 [AArch64]Add missing floating point convert, round and misc intrinsics.
E.g. int64x1_t vcvt_s64_f64(float64x1_t a) -> FCVTZS Dd, Dn


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196210 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-03 06:06:55 +00:00
Hao Liu
5025a48f68 AArch64: add missing ACLE intrinsics mapping to general arithmetic operation from VFP instructions.
E.g. float64x1_t vadd_f64(float64x1_t a, float64x1_t b) -> FADD Dd, Dn, Dm.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196208 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-03 05:58:30 +00:00
Hao Liu
3d69ff4d07 AArch64: Add missing scalar pair intrinsics.
E.g. "float32_t vaddv_f32(float32x2_t a)" to be matched into "faddp s0, v1.2s".


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196198 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-03 03:39:47 +00:00
Jiangning Liu
bbc450c5cf Add some missing pattern matches for AArch64 Neon intrinsics like vuqadd_s64 and friends.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196192 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-03 01:33:52 +00:00
Jiangning Liu
7f1f8d4146 Add some missing pattern matches for AArch64 Neon intrinsics like vmull_high_n_s16 and friends.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196190 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-03 01:29:32 +00:00
Chad Rosier
d4809bb0e3 [AArch64] Implemented vcopy_lane patterns using scalar DUP instruction.
Patch by Ana Pazos!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196151 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-02 21:05:16 +00:00
Vincent Lejeune
7043c7a35e R600: Workaround for cayman loop bug
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196121 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-02 17:29:37 +00:00
Tim Northover
ad249171e4 ARM: decide whether to use movw/movt based on "minsize" attribute.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196102 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-02 14:46:26 +00:00
Robert Lytton
f19c6f5763 XCore target: Make handling of large frames not dependent upon an FP.
eliminateFrameIndex() has been reworked to handle both small & large frames
with either a FP or SP.
An additional Slot is required for Scavenging spills when not using FP for large frames.
Reworked the handling of Register Scavenging.

Whether we are using an FP or not, whether it is a large frame or not,
and whether we are using a large code model or not are now independent.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196091 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-02 11:05:28 +00:00
Tim Northover
f715d51769 ARM: add pseudo-instructions for lit-pool global materialisation
These are used by MachO only at the moment, and (much like the existing
MOVW/MOVT set) work around the fact that the labels used in the actual
instructions often contain PC-dependent components, which means that repeatedly
materialising the same global can't be CSEed.

With small modifications, it could be adapted to how ELF finds the address of
_GLOBAL_OFFSET_TABLE_, which would give similar benefits in PIC mode there.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196090 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-02 10:35:41 +00:00
Robert Lytton
41d4ed4ed0 XCore target: fix large code model 'select' indirect address handling.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196088 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-02 10:18:37 +00:00
Robert Lytton
883abacb5b XCore target: Add large code model
When using large code model:
Global objects larger than 'CodeModelLargeSize' bytes are placed in sections named with a trailing ".large"
The folded global address of such objects are lowered into the const pool.

During inspection it was noted that LowerConstantPool() was using a default offset of zero.
A fix was made, but due to only offsets of zero being generated, testing only verifies the change is not detrimental.

Correct the flags emitted for explicitly specified sections.

We assume the size of the object queried by getSectionForConstant() is never greater than CodeModelLargeSize.
To handle greater than CodeModelLargeSize, changes to AsmPrinter would be required.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196087 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-02 10:18:31 +00:00
Robert Lytton
7112fa0523 XCore target: extend tests in preparation
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196086 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-02 10:18:24 +00:00
Robert Lytton
25464b948f XCore target: Fix eliminateFrameIndex() to handle large frames
Large frame offsets are loaded from the ConstantPool.
Where possible, offsets are encoded using the smaller MKMSK instruction.
Large frame offsets can only be used when there is a frame-pointer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196085 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-02 10:18:19 +00:00
Robert Lytton
1326c6f14b XCore target: Enable frames larger than 65535 to be lowered
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196084 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-02 10:18:14 +00:00
Rafael Espindola
ba7cb02009 Also test the created stubs on 32 bits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196052 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-01 21:24:30 +00:00
Andrew Trick
1561e8381f Add -mcpu to stackmap.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196051 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-01 18:17:05 +00:00
Tim Northover
e54f6dca50 ARM: fix bug in -Oz stack adjustment folding
Previously, we clobbered callee-saved registers when folding an "add
sp, #N" into a "pop {rD, ...}" instruction. This change checks whether
a register we're going to add to the "pop" could actually be live
outside the function before doing so and should fix the issue.

This should fix PR18081.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196046 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-01 14:16:24 +00:00
Hal Finkel
6e1de2e63e Convert a PPC test from grep to FileCheck
Convert this test to FileCheck, and improve it to check for the instructions it
is trying to exclude instead of checking for register use (especially because
grepping for r1 can be thrown off, for example, by a use of r12).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195979 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-30 20:04:33 +00:00
Hal Finkel
7373abd81e Desensitize a couple of PPC regression tests
Use CHECK-DAG to make these regression tests more resilient against changes in
instruction scheduling.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195978 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-30 19:52:28 +00:00
Hal Finkel
36e5511188 Update the cpu specified on some PPC regression tests
Some of these tests did not specify a cpu but were also sensitive to
instruction scheduling and/or register assignment choices. A few others
similarly-sensitive tests specified a cpu (often the POWER7), and while the P7
currently uses the default model for PPC64, this will soon change. For those
tests which should not really be cpu-dependent anyway, the cpu is set to the
generic 'ppc64'.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195977 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-30 19:39:27 +00:00
Daniel Sanders
6fdef5ecd7 [mips][msa] MSA loads and stores have a 10-bit offset. Account for this when lowering FrameIndex.
This prevents the compiler from emitting invalid ld.[bhwd]'s and st.[bhwd]'s
when the stack frame is between 512 and 32,768 bytes in size.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195973 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-30 13:47:57 +00:00
Juergen Ributzka
1baf0c0924 Force CPU type to unbreak unit tests on Haswell machines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195971 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-30 03:07:16 +00:00
Reed Kotler
dcb0422f25 Part 1 of 3 patches that completes very long conditional branches
in constant islands for Mips16. We introdcuce JalB16 as a synomnym
for Jal16. It makes it easier to read and is also necessary because
Jal16 is a call instruction but JalB16 is being used as a branch.
Various parts of LLVM will not work properly even in this late stage of
the backend if we use what was declared as a call instruction to function
as a branch. For one, basic block labels may not get emitted in some
situations. 



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195968 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-29 22:32:56 +00:00
Hao Liu
7fd70e7b0c AArch64: The pattern match should check the range of the immediate value.
Or we can generate some illegal instructions.
E.g. shrn2 v0.4s, v1.2d, #35. The legal range should be in [1, 16].


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195941 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-29 02:11:22 +00:00
Jiangning Liu
d4685468fd Add missing test case for bsl_f64 support of AArch64 NEON.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195939 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-29 01:38:08 +00:00
Reed Kotler
18a777a09a Check in conditional branches for constant islands. Still need to finish
conditional branches for very large targets. That will be the next small
patch. Everything now should in principle work as good (functionality
wise) as without constant islands so we decided at Mips/Imagination to
make constant islands the default for Mips16 now so that it will get
excercised a lot and this port is still experimentatl though hopefully soon
we will change the status. Some more cleanup and code review is in order
but things are converging fast.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195902 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-28 00:56:37 +00:00
Akira Hatanaka
bd44867477 [mips] Implement the following optimizations using dominance information to
make PIC calls a little more efficient:

1. Remove instructions setting up $gp if it is known that a function has been
   called at least once.
2. Save the address of a called function in a register instead of loading
   it from the GOT at every call site.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195892 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-27 23:38:42 +00:00
Tom Stellard
8a6b7df6f8 R600: Expand vector FABS
NOTE: This is a candidate for the 3.4 branch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195881 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-27 21:23:39 +00:00
Tom Stellard
aa6ec15caf R600/SI: Implement spilling of SGPRs v5
SGPRs are spilled into VGPRs using the {READ,WRITE}LANE_B32 instructions.

v2:
  - Fix encoding of Lane Mask
  - Use correct register flags, so we don't overwrite the low dword
    when restoring multi-dword registers.

v3:
  - Register spilling seems to hang the GPU, so replace all shaders
    that need spilling with a dummy shader.

v4:
  - Fix *LANE definitions
  - Change destination reg class for 32-bit SMRD instructions

v5:
  - Remove small optimization that was crashing Serious Sam 3.

https://bugs.freedesktop.org/show_bug.cgi?id=68224
https://bugs.freedesktop.org/show_bug.cgi?id=71285

NOTE: This is a candidate for the 3.4 branch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195880 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-27 21:23:35 +00:00
Tom Stellard
0cbf943733 R600/SI: Use SGPR_32 register class for 32-bit SMRD outputs
Writing to the M0 register from an SMRD instruction hangs the GPU, so
we need to use the SGPR_32 register class, which does not include M0.

NOTE: This is a candidate for the 3.4 branch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195879 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-27 21:23:29 +00:00
Tom Stellard
496dbfe7b9 R600: Add support for ISD::FROUND
NOTE: This is a candidate for the 3.4 branch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195878 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-27 21:23:20 +00:00
Rafael Espindola
839b0556cd Use FileCheck and expand the test a bit.
In particular, check the name of the symbol we are putting in the constant pool.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195865 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-27 19:22:14 +00:00
Jiangning Liu
35df2e8c7f Fix the AArch64 NEON bug exposed by checking constant integer argument range of ACLE intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195843 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-27 14:02:25 +00:00
Rafael Espindola
ef8a810cd7 Cleanup and test X86AsmPrinter::printPCRelImm.
It is only used for asm printing.

On X86 we put basic block addresses on register before passing them to inline
asm, so the MO_MachineBasicBlock case was dead.

MO_ExternalSymbol was dead since any symbol being passed to inline asm
is represented as MO_GlobalAddress.

The MO_GlobalAddress and MO_Register cases were not tested.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195824 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-27 06:53:13 +00:00
Chad Rosier
9fef0370c5 [AArch64] Add support for NEON scalar floating-point absolute difference.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195803 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-27 01:45:58 +00:00
Chad Rosier
48f115aabf [AArch64] Add support for NEON scalar floating-point to integer convert
instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195788 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-26 22:17:37 +00:00
Reed Kotler
c0dfa22e19 Fix a bug related to constant islands for Mips16 and mips16/32 dual mode.
The determination of when we are doing constant pools was being made too
early in the asm printer.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195781 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-26 20:38:40 +00:00
Michael Liao
fd115c47a2 Fix PR18054
- Fix bug in (vsext (vzext x)) -> (vsext x) in SIGN_EXTEND_IN_REG
  lowering where we need to check whether x is a vector type (in-reg
  type) of i8, i16 or i32; otherwise, that optimization is not valid.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195779 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-26 20:31:31 +00:00
Tim Northover
2254509d71 Darwin-ARM: use movw/movt for static relocations
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195759 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-26 12:45:05 +00:00
Richard Sandiford
396e080b34 [SystemZ] Fix incorrect use of RISBG for a zero-extended right shift
We would wrongly transform the testcase into the equivalent of an AND with 1.
The problem was that, when testing whether the shifted-in bits of the right
shift were significant, we used the width of the final zero-extended result
rather than the width of the shifted value.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195731 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-26 10:53:16 +00:00
Kevin Qin
cf7ed12a1d Refactored the implementation of AArch64 NEON instruction ZIP, UZP
and TRN.
Fix a bug when mixed use of vget_high_u8() and vuzp_u8().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195716 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-26 03:26:47 +00:00
Kevin Qin
57f6b2778b [AArch64]Implement 128 bit register copy with NEON.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195713 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-26 02:33:42 +00:00
Andrew Trick
501aeea325 StackMap: Implement support for DirectMemRefOp.
A Direct stack map location records the address of frame index. This
address is itself the value that the runtime requested. This differs
from IndirectMemRefOp locations, which refer to a stack locations from
which the requested values must be loaded. Direct locations can
directly communicate the address if an alloca, while IndirectMemRefOp
handle register spills.

For example:

entry:
  %a = alloca i64...
  llvm.experimental.stackmap(i32 <ID>, i32 <shadowBytes>, i64* %a)

Since both the alloca and stackmap intrinsic are in the entry block,
and the intrinsic takes the address of the alloca, the runtime can
assume that LLVM will not substitute alloca with any intervening
value. This must be verified by the runtime by checking that the stack
map's location is a Direct location type. The runtime can then
determine the alloca's relative location on the stack immediately after
compilation, or at any time thereafter. This differs from Register and
Indirect locations, because the runtime can only read the values in
those locations when execution reaches the instruction address of the
stack map.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195712 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-26 02:03:25 +00:00
Cameron McInally
0e6ec124d5 Add an intrinsic for the SSE2 PAUSE instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195697 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-26 00:20:43 +00:00
Bill Wendling
5df09f0367 Unrevert r195599 with testcase fix.
I'm not sure how it was checking for the wrong values...
PR18023.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195670 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-25 18:05:22 +00:00