Commit Graph

8339 Commits

Author SHA1 Message Date
Andrew Trick
81349a7435 Track IR ordering of SelectionDAG nodes 4/4.
Unit test cases for -pre-RA-sched=source.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182706 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-25 03:26:51 +00:00
Andrew Trick
dd0fb018a7 Track IR ordering of SelectionDAG nodes 3/4.
Remove the old IR ordering mechanism and switch to new one.  Fix unit
test failures.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182704 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-25 03:08:10 +00:00
Hal Finkel
80d10ded8c PPC: Initial support for permutation-based unaligned Altivec loads
Altivec only directly supports aligned loads, but the loads have a strange
property: If given an unaligned address, they truncate the address to the next
lower aligned address, and load from there.  This property, along with an extra
load and some special-purpose permutation-control instructions that generate
the appropriate permutations from the original unaligned address, allow
efficient lowering of aligned loads. This code uses the trick explained in the
Apple Velocity Engine optimization overview document to prevent the needed
extra load from possibly causing a page fault if the original address happens
to be aligned.

As noted in the FIXMEs, there are several additional optimizations that can be
performed to reduce the cost of these loads even more. These will be
implemented in future commits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182691 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-24 23:00:14 +00:00
Diego Novillo
77226a03dc Add a new function attribute 'cold' to functions.
Other than recognizing the attribute, the patch does little else.
It changes the branch probability analyzer so that edges into
blocks postdominated by a cold function are given low weight.

Added analysis and code generation tests.  Added documentation for the
new attribute.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182638 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-24 12:26:52 +00:00
Tim Northover
5a02fc4b5f ARM: implement @llvm.readcyclecounter intrinsic
This implements the @llvm.readcyclecounter intrinsic as the specific
MRC instruction specified in the ARM manuals for CPUs with the Power
Management extensions.

Older CPUs had slightly different methods which may also have to be
implemented eventually, but this should cover all v7 cases.

rdar://problem/13939186

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182603 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-23 19:11:20 +00:00
Tom Stellard
d078070f6a R600: Fix R600ControlFlowFinalizer not considering VTX_READ 128 bit dst reg
Patch by: Vincent Lejeune

https://bugs.freedesktop.org/show_bug.cgi?id=64877

NOTE: This is a candidate for the 3.3 branch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182600 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-23 18:26:42 +00:00
Jakob Stoklund Olesen
e0b59774cb Fix PR16110: Handle DBG_VALUE in ConnectedVNInfoEqClasses::Distribute().
Now that the LiveDebugVariables pass is running *after* register
coalescing, the ConnectedVNInfoEqClasses class needs to deal with
DBG_VALUE instructions.

This only comes up when rematerialization during coalescing causes the
remaining live range of a virtual register to separate into two
connected components.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182592 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-23 17:02:23 +00:00
Nick Lewycky
fa03ff99b2 Add missing test from r175092.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182564 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-23 07:46:13 +00:00
Nadav Rotem
23d1d5eb56 X86: Fix a bug in EltsFromConsecutiveLoads. We can't generate new loads without chains.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182507 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-22 19:28:41 +00:00
Benjamin Kramer
60ef6c9295 X86: When expanding PCMPGTQ to PCMPGTD we always want to compare the lower halves as unsigned.
Take #2 on fixing PR15977.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182486 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-22 17:01:12 +00:00
David Majnemer
3b4b5367da X86: Remove test instructions proceeding shift by immediate instructions
Allow LLVM to take advantage of shift instructions that set the ZF flag,
making instructions that test the destination superfluous.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182454 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-22 08:13:02 +00:00
Akira Hatanaka
2591b5c6c3 [mips] Rename option to make it compatible with gcc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182397 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-21 17:17:59 +00:00
Akira Hatanaka
1d4d32398d [mips] Add instruction selection patterns for blez and bgez.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182396 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-21 17:13:47 +00:00
Justin Holewinski
b9c26dcb24 [NVPTX] Add @llvm.nvvm.sqrt.f() intrinsic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182394 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-21 16:51:30 +00:00
Justin Holewinski
c2b7f5fa51 Drop @llvm.annotation and @llvm.ptr.annotation intrinsics during codegen.
The intrinsic calls are dropped, but the annotated value is propagated.

Fixes PR 15253

Original patch by Zeng Bin!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182387 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-21 14:37:16 +00:00
Benjamin Kramer
f106d8bad6 X86: When emulating unsigned PCMPGTQ with PCMPGTD, fix the sign bit for the smaller type.
Otherwise we'll get a mix of signed and unsigned compares.
Fixes PR15977.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182364 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-21 09:58:54 +00:00
Richard Sandiford
af2a1bebfc [SystemZ] Tighten branch tests
After r182274, the branches in these tests must always be short.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182358 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-21 08:53:17 +00:00
Benjamin Kramer
f19b8b018b DAGCombine: Avoid an edge case where it tried to create an i0 type for (x & 0) == 0.
Fixes PR16083.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182357 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-21 08:51:09 +00:00
Reed Kotler
49d44a080a Add checks that the proper predeined stubs are being called to the test case.
These were accidentally omitted.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182347 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-21 01:27:36 +00:00
Reed Kotler
bf00bf9ad2 Add some additional functions to the list of helper functions for
pic calls. These need to be there so we don't try and use helper
functions when we call those.

As part of this, make sure that we properly exclude helper functions in pic
mode when indirect calls are involved.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182343 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-21 00:50:30 +00:00
Akira Hatanaka
1aeb13bd9c [mips] Add (setne $lhs, 0) instruction selection pattern.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182307 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 18:18:07 +00:00
Akira Hatanaka
f894199a14 [mips] Trap on integer division by zero.
By default, a teq instruction is inserted after integer divide. No divide-by-zero
checks are performed if option "-mnocheck-zero-division" is used.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182306 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 18:07:43 +00:00
Justin Holewinski
9b39c726a0 [NVPTX] Fix mis-use of CurrentFnSym in NVPTXAsmPrinter. This was causing a symbol name error in the output PTX.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182298 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 16:42:18 +00:00
Tom Stellard
4f8d90df45 R600: Fix rotr.ll on non-asserts builds
The -debug-only option is only available on asserts builds.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182291 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 15:28:48 +00:00
Tom Stellard
0bbfc9313c R600/SI: Add pattern for rotr
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182286 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 15:02:24 +00:00
Tom Stellard
ba534c2143 R600: Swap the legality of rotl and rotr
The hardware supports rotr and not rotl.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182285 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 15:02:19 +00:00
Tom Stellard
a9d5d0b346 R600/SI: Add patterns for 64-bit shift operations
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182284 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 15:02:12 +00:00
Richard Sandiford
44b486ed78 [SystemZ] Add long branch pass
Before this change, the SystemZ backend would use BRCL for all branches
and only consider shortening them to BRC when generating an object file.
E.g. a branch on equal would use the JGE alias of BRCL in assembly output,
but might be shortened to the JE alias of BRC in ELF output.  This was
a useful first step, but it had two problems:

(1) The z assembler isn't traditionally supposed to perform branch shortening
    or branch relaxation.  We followed this rule by not relaxing branches
    in assembler input, but that meant that generating assembly code and
    then assembling it would not produce the same result as going directly
    to object code; the former would give long branches everywhere, whereas
    the latter would use short branches where possible.

(2) Other useful branches, like COMPARE AND BRANCH, do not have long forms.
    We would need to do something else before supporting them.

    (Although COMPARE AND BRANCH does not change the condition codes,
    the plan is to model COMPARE AND BRANCH as a CC-clobbering instruction
    during codegen, so that we can safely lower it to a separate compare
    and long branch where necessary.  This is not a valid transformation
    for the assembler proper to make.)

This patch therefore moves branch relaxation to a pre-emit pass.
For now, calls are still shortened from BRASL to BRAS by the assembler,
although this too is not really the traditional behaviour.

The first test takes about 1.5s to run, and there are likely to be
more tests in this vein once further branch types are added.  The feeling
on IRC was that 1.5s is a bit much for a single test, so I've restricted
it to SystemZ hosts for now.

The patch exposes (and fixes) some typos in the main CodeGen/SystemZ tests.
A later patch will remove the {{g}}s from that directory.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182274 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 14:23:08 +00:00
Justin Holewinski
7536ecf291 [NVPTX] Add GenericToNVVM IR converter to better handle idiomatic LLVM IR inputs
This converter currently only handles global variables in address space 0. For
these variables, they are promoted to address space 1 (global memory), and all
uses are updated to point to the result of a cvta.global instruction on the new
variable.

The motivation for this is address space 0 global variables are illegal since we
cannot declare variables in the generic address space.  Instead, we place the
variables in address space 1 and explicitly convert the pointer to address
space 0. This is primarily intended to help new users who expect to be able to
place global variables in the default address space.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182254 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 12:13:32 +00:00
Justin Holewinski
55fdf53629 [NVPTX] Fix i1 kernel parameters and global variables. ABI rules say we need to use .u8 for i1 parameters for kernels.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182253 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 12:13:28 +00:00
Stepan Dyatkovskiy
083bc97344 PR15868 fix.
Introduction:
In case when stack alignment is 8 and GPRs parameter part size is not N*8:
we add padding to GPRs part, so part's last byte must be recovered at
address K*8-1.
We need to do it, since remained (stack) part of parameter starts from
address K*8, and we need to "attach" "GPRs head" without gaps to it:

Stack:
|---- 8 bytes block ----| |---- 8 bytes block ----| |---- 8 bytes...
[ [padding] [GPRs head] ] [ ------ Tail passed via stack  ------ ...

FIX:
Note, once we added padding we need to correct *all* Arg offsets that are going
after padded one. That's why we need this fix: Arg offsets were never corrected
before this patch. See new test-cases included in patch.

We also don't need to insert padding for byval parameters that are stored in GPRs
only. We need pad only last byval parameter and only in case it outsides GPRs
and stack alignment = 8.
Though, stack area, allocated for recovered byval params, must satisfy
"Size mod 8 = 0" restriction.

This patch reduces stack usage for some cases:
We can reduce ArgRegsSaveArea since inner N*4 bytes sized byval params my be
"packed" with alignment 4 in some cases.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182237 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 08:01:34 +00:00
Jakob Stoklund Olesen
89f530ebbf Also expand 64-bit bitcasts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182229 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 01:01:43 +00:00
Jakob Stoklund Olesen
5e5b78ca36 Implement spill and fill of I64Regs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182228 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 00:53:25 +00:00
Jakob Stoklund Olesen
900622e099 Mark i64 SETCC as expand so it is turned into a SELECT_CC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182227 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 00:28:36 +00:00
Jakob Stoklund Olesen
634123e98d Don't use %g0 to materialize 0 directly.
The wired physreg doesn't work on tied operands like on MOVXCC.

Add a README note to fix this later.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182225 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-19 21:47:13 +00:00
Jakob Stoklund Olesen
60abcb786e Select i64 values with %icc conditions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182224 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-19 20:38:21 +00:00
Jakob Stoklund Olesen
51d46c36bc Add floating point selects on %xcc predicates.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182222 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-19 20:33:11 +00:00
Jakob Stoklund Olesen
89db6732fb Implement SPselectfcc for i64 operands.
Also clean up the arguments to all the MOVCC instructions so the
operands always are (true-val, false-val, cond-code).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182221 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-19 20:20:54 +00:00
Venkatraman Govindaraju
21886a495a [Sparc] Rearrange integer registers' allocation order so that register allocator will use I and G registers before using L and O registers.
Also, enable registers %g2-%g4 to be used in application and %g5 in 64 bit mode.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182219 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-19 20:07:20 +00:00
Jakob Stoklund Olesen
00ce0f6512 Handle i64 FrameIndex nodes in SPARC v9 mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182216 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-19 19:14:24 +00:00
Hal Finkel
bf0bc3b2a2 Check InlineAsm clobbers in PPCCTRLoops
We don't need to reject all inline asm as using the counter register (most does
not). Only those that explicitly clobber the counter register need to prevent
the transformation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182191 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-18 09:20:39 +00:00
David Majnemer
8a55c2ecd4 X86: Bad peephole interaction between adc, MOV32r0
The peephole tries to reorder MOV32r0 instructions such that they are
before the instruction that modifies EFLAGS.

The problem is that the peephole does not consider the case where the
instruction that modifies EFLAGS also depends on the previous state of
EFLAGS.

Instead, walk backwards until we find an instruction that has a def for
EFLAGS but does not have a use.
If we find such an instruction, insert the MOV32r0 before it.
If it cannot find such an instruction, skip the optimization.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182184 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-18 01:02:03 +00:00
JF Bastien
bab06ba696 Support unaligned load/store on more ARM targets
This patch matches GCC behavior: the code used to only allow unaligned
load/store on ARM for v6+ Darwin, it will now allow unaligned load/store
for v6+ Darwin as well as for v7+ on Linux and NaCl.

The distinction is made because v6 doesn't guarantee support (but LLVM
assumes that Apple controls hardware+kernel and therefore have
conformant v6 CPUs), whereas v7 does provide this guarantee (and
Linux/NaCl behave sanely).

The patch keeps the -arm-strict-align command line option, and adds
-arm-no-strict-align. They behave similarly to GCC's -mstrict-align and
-mnostrict-align.

I originally encountered this discrepancy in FastIsel tests which expect
unaligned load/store generation. Overall this should slightly improve
performance in most cases because of reduced I$ pressure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182175 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-17 23:49:01 +00:00
Vincent Lejeune
df98ad3959 R600: Lower int_load_input to copyFromReg instead of Register node
It solves a bug uncovered by dot4 patch where the register class of
int_load_input use was ignored.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182130 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-17 16:51:06 +00:00
Vincent Lejeune
76fc2d077f R600: Use bottom up scheduling algorithm
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182129 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-17 16:50:56 +00:00
Vincent Lejeune
21ca0b3ea4 R600: Use depth first scheduling algorithm
It should increase PV substitution opportunities and lower gpr
usage (pending computations path are "flushed" sooner)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182128 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-17 16:50:44 +00:00
Vincent Lejeune
4ed9917147 R600: Relax some vector constraints on Dot4.
Dot4 now uses 8 scalar operands instead of 2 vectors one which allows register
coalescer to remove some unneeded COPY.
This patch also defines some structures/functions that can be used to handle
every vector instructions (CUBE, Cayman special instructions...) in a similar
fashion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182126 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-17 16:50:32 +00:00
Vincent Lejeune
d3293b49f9 R600: Improve texture handling
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182125 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-17 16:50:20 +00:00
Vincent Lejeune
4109bd8829 R600: Rename 128 bit registers.
Almost all instructions that takes a 128 bits reg as input (fetch, export...)
have the abilities to swizzle their argument and output. Instead of printing
default swizzle for each 128 bits reg, rename T*.XYZW to T* and let instructions
print potentially optimized swizzles themselves.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182124 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-17 16:50:09 +00:00
Tom Stellard
0976e3c6d9 R600: Fix encoding for R600 family GPUs
Reviewed-by: Vincent Lejeune <vljn@ovi.com>

https://bugs.freedesktop.org/show_bug.cgi?id=64193
https://bugs.freedesktop.org/show_bug.cgi?id=64257
https://bugs.freedesktop.org/show_bug.cgi?id=64320

NOTE: This is a candidate for the 3.3 branch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182113 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-17 15:23:21 +00:00
Venkatraman Govindaraju
a65d33760b [Sparc] Implements hasReservedCallFrame and hasFP.
This is to generate correct framesetup code when the function
 has variable sized allocas.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182108 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-17 15:14:34 +00:00
Benjamin Kramer
a0de26ce34 X86: Make shuffle -> shift conversion more aggressive about undefs.
Shuffles that only move an element into position 0 of the vector are common in
the output of the loop vectorizer and often generate suboptimal code when SSSE3
is not available. Lower them to vector shifts if possible.

We still prefer palignr over psrldq because it has higher throughput on
sandybridge.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182102 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-17 14:48:34 +00:00
Benjamin Kramer
c032d1aca0 FileCheckize test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182101 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-17 14:48:25 +00:00
Venkatraman Govindaraju
d6b4caf291 [Sparc] Prevent instructions that defines or uses %o7 to be in call's delay slot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182063 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-16 23:53:29 +00:00
Akira Hatanaka
ae7e7cb3d3 [mips] Improve instruction selection for pattern (store (fp_to_sint $src), $ptr).
Previously, three instructions were needed:

trunc.w.s $f0, $f2
mfc1 $4, $f0
sw $4, 0($2)

Now we need only two:

trunc.w.s $f0, $f2
swc1 $f0, 0($2)


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182053 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-16 21:17:15 +00:00
Rafael Espindola
529874cf0c More test coverage for addFrameMove.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182051 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-16 20:50:56 +00:00
Hal Finkel
ae06fa2542 Fix cpu on test CodeGen/PowerPC/ctrloop-fp64.ll
We need ppc instead of generic to override native features on ppc machines.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182049 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-16 20:28:05 +00:00
Rafael Espindola
7733728ac2 More addFrameMove test coverage.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182046 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-16 20:00:45 +00:00
Hal Finkel
c482454e3c Create an new preheader in PPCCTRLoops to avoid counter register clobbers
Some IR-level instructions (such as FP <-> i64 conversions) are not chained
w.r.t. the mtctr intrinsic and yet may become function calls that clobber the
counter register. At the selection-DAG level, these might be reordered with the
mtctr intrinsic causing miscompiles. To avoid this situation, if an existing
preheader has instructions that might use the counter register, create a new
preheader for the mtctr intrinsic. This extra block will be remerged with the
old preheader at the MI level, but will prevent unwanted reordering at the
selection-DAG level.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182045 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-16 19:58:38 +00:00
Akira Hatanaka
02e168003f [mips] Test case for r182042. Add comment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182044 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-16 19:57:23 +00:00
Rafael Espindola
50f02f9d21 More test coverage for addFrameMove.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182041 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-16 19:44:40 +00:00
Benjamin Kramer
8401ed21aa DAGCombine: Also shrink eq compares where the constant is exactly as large as the smaller type.
if ((x & 255) == 255)

before: movzbl  %al, %eax
        cmpl  $255, %eax

after:  cmpb  $-1, %al

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182038 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-16 18:47:58 +00:00
Ulrich Weigand
347a5079e1 [PowerPC] Use true offset value in "memrix" machine operands
This is the second part of the change to always return "true"
offset values from getPreIndexedAddressParts, tackling the
case of "memrix" type operands.

This is about instructions like LD/STD that only have a 14-bit
field to encode immediate offsets, which are implicitly extended
by two zero bits by the machine, so that in effect we can access
16-bit offsets as long as they are a multiple of 4.

The PowerPC back end currently handles such instructions by
carrying the 14-bit value (as it will get encoded into the
actual machine instructions) in the machine operand fields
for such instructions.  This means that those values are
in fact not the true offset, but rather the offset divided
by 4 (and then truncated to an unsigned 14-bit value).

Like in the case fixed in r182012, this makes common code
operations on such offset values not work as expected.
Furthermore, there doesn't really appear to be any strong
reason why we should encode machine operands this way.

This patch therefore changes the encoding of "memrix" type
machine operands to simply contain the "true" offset value
as a signed immediate value, while enforcing the rules that
it must fit in a 16-bit signed value and must also be a
multiple of 4.

This change must be made simultaneously in all places that
access machine operands of this type.  However, just about
all those changes make the code simpler; in many cases we
can now just share the same code for memri and memrix
operands.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182032 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-16 17:58:02 +00:00
Hal Finkel
2a5e8c328e PPC32 cannot form counter loops around i64 FP conversions
On PPC32, i64 FP conversions are implemented using runtime calls (which clobber
the counter register). These must be excluded.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182023 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-16 16:52:41 +00:00
Rafael Espindola
3e521a5223 Add a triple to the test to try to fix the windows bots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182022 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-16 16:48:46 +00:00
Rafael Espindola
c2d01fd5c2 More addFrameMove test coverage.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182021 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-16 16:34:38 +00:00
Bill Schmidt
0d6423b476 Use new CHECK-DAG support to stabilize CodeGen/PowerPC/recipest.ll
While testing some experimental code to add vector-scalar registers to
PowerPC, I noticed that a couple of independent instructions were
flipped by the scheduler.  The new CHECK-DAG support is perfect for
avoiding this problem.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182020 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-16 16:15:18 +00:00
Rafael Espindola
8da0cebc92 Add more addFrameMove test coverage.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182019 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-16 16:09:54 +00:00
Rafael Espindola
3808c4d206 Add more test coverage for addFrameMove.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182017 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-16 15:18:50 +00:00
Ulrich Weigand
f0ef882828 [PowerPC] Report true displacement value from getPreIndexedAddressParts
DAGCombiner::CombineToPreIndexedLoadStore calls a target routine to
decompose a memory address into a base/offset pair.  It expects the
offset (if constant) to be the true displacement value in order to
perform optional additional optimizations; in particular, to convert
other uses of the original pointer into uses of the new base pointer
after pre-increment.

The PowerPC implementation of getPreIndexedAddressParts, however,
simply calls SelectAddressRegImm, which returns a TargetConstant.
This value is appropriate for encoding into the instruction, but
it is not always usable as true displacement value:

- Its type is always MVT::i32, even on 64-bit, where addresses
  ought to be i64 ... this causes the optimization to simply
  always fail on 64-bit due to this line in DAGCombiner:

      // FIXME: In some cases, we can be smarter about this.
      if (Op1.getValueType() != Offset.getValueType()) {

- Its value is truncated to an unsigned 16-bit value if negative.
  This causes the above opimization to generate wrong code.

This patch fixes both problems by simply returning the true
displacement value (in its original type).  This doesn't
affect any other user of the displacement.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182012 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-16 14:53:05 +00:00
Rafael Espindola
26bca5816d Add more addFrameMove test coverage.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182011 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-16 14:51:26 +00:00
Rafael Espindola
aba2d6d051 Extend test to check the .cfi instructions.
I am about to refactor the calls to addFrameMove and some of the ppc
ones were not being tested.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182009 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-16 14:30:09 +00:00
Benjamin Kramer
d37635d2f2 Relax CHECK-NEXTs a bit to cope with atom's return nop padding.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181999 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-16 11:46:50 +00:00
Rafael Espindola
0225d5a3af Extend test for better coverage.
Without this change nothing was covering this addFrameMove:

// For 64-bit SVR4 when we have spilled CRs, the spill location
// is SP+8, not a frame-relative slot.
if (Subtarget.isSVR4ABI()
    && Subtarget.isPPC64()
    && (PPC::CR2 <= Reg && Reg <= PPC::CR4)) {
  MachineLocation CSDst(PPC::X1, 8);
  MachineLocation CSSrc(PPC::CR2);
  MMI.addFrameMove(Label, CSDst, CSSrc);
  continue;
}

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181976 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-16 03:48:50 +00:00
Reed Kotler
1a2265bc01 Patch number 2 for mips16/32 floating point interoperability stubs.
This creates stubs that help Mips32 functions call Mips16 
functions which have floating point parameters that are normally passed
in floating point registers.
 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181972 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-16 02:17:42 +00:00
David Majnemer
55a6f111fc Set an explicit triple for this test.
This allows the test to correctly check symbol names.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181939 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-15 22:23:21 +00:00
David Majnemer
17585dc4d4 X86: Remove redundant test instructions
Increase the number of instructions LLVM recognizes as setting the ZF
flag. This allows us to remove test instructions that redundantly
recalculate the flag.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181937 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-15 22:03:08 +00:00
Hal Finkel
b1fd3cd78f Implement PPC counter loops as a late IR-level pass
The old PPCCTRLoops pass, like the Hexagon pass version from which it was
derived, could only handle some simple loops in canonical form. We cannot
directly adapt the new Hexagon hardware loops pass, however, because the
Hexagon pass contains a fundamental assumption that non-constant-trip-count
loops will contain a guard, and this is not always true (the result being that
incorrect negative counts can be generated). With this commit, we replace the
pass with a late IR-level pass which makes use of SE to calculate the
backedge-taken counts and safely generate the loop-count expressions (including
any necessary max() parts). This IR level pass inserts custom intrinsics that
are lowered into the desired decrement-and-branch instructions.

The most fragile part of this new implementation is that interfering uses of
the counter register must be detected on the IR level (and, on PPC, this also
includes any indirect branches in addition to function calls). Also, to make
all of this work, we need a variant of the mtctr instruction that is marked
as having side effects. Without this, machine-code level CSE, DCE, etc.
illegally transform the resulting code. Hopefully, this can be improved
in the future.

This new pass is smaller than the original (and much smaller than the new
Hexagon hardware loops pass), and can handle many additional cases correctly.
In addition, the preheader-creation code has been copied from LoopSimplify, and
after we decide on where it belongs, this code will be refactored so that it
can be explicitly shared (making this implementation even smaller).

The new test-case files ctrloop-{le,lt,ne}.ll have been adapted from tests for
the new Hexagon pass. There are a few classes of loops that this pass does not
transform (noted by FIXMEs in the files), but these deficiencies can be
addressed within the SE infrastructure (thus helping many other passes as well).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181927 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-15 21:37:41 +00:00
Derek Schuff
c22cdb7203 Fix miscompile due to StackColoring incorrectly merging stack slots (PR15707)
IR optimisation passes can result in a basic block that contains:

  llvm.lifetime.start(%buf)
  ...
  llvm.lifetime.end(%buf)
  ...
  llvm.lifetime.start(%buf)

Before this change, calculateLiveIntervals() was ignoring the second
lifetime.start() and was regarding %buf as being dead from the
lifetime.end() through to the end of the basic block.  This can cause
StackColoring to incorrectly merge %buf with another stack slot.

Fix by removing the incorrect Starts[pos].isValid() and
Finishes[pos].isValid() checks.

Just doing:
      Starts[pos] = Indexes->getMBBStartIdx(MBB);
      Finishes[pos] = Indexes->getMBBEndIdx(MBB);
unconditionally would be enough to fix the bug, but it causes some
test failures due to stack slots not being merged when they were
before.  So, in order to keep the existing tests passing, treat LiveIn
and LiveOut separately rather than approximating the live ranges by
merging LiveIn and LiveOut.

This fixes PR15707.
Patch by Mark Seaborn.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181922 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-15 21:15:09 +00:00
Richard Sandiford
ddbf053a4c [SystemZ] Make use of SUBTRACT HALFWORD
Thanks to Ulrich Weigand for noticing that this instruction was missing.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181893 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-15 15:05:29 +00:00
Arnold Schwaighofer
101a36117c ARM ISel: Don't create illegal types during LowerMUL
The transformation happening here is that we want to turn a
"mul(ext(X), ext(X))" into a "vmull(X, X)", stripping off the extension. We have
to make sure that X still has a valid vector type - possibly recreate an
extension to a smaller type. In case of a extload of a memory type smaller than
64 bit we used create a ext(load()). The problem with doing this - instead of
recreating an extload - is that an illegal type is exposed.

This patch fixes this by creating extloads instead of ext(load()) sequences.

Fixes PR15970.

radar://13871383

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181842 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-14 22:33:24 +00:00
Jyotsna Verma
a29a8965e2 Hexagon: Pass to replace tranfer/copy instructions into combine instruction
where possible.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181817 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-14 18:54:06 +00:00
Eric Christopher
f276c70bb8 Reapply "Subtract isn't commutative, fix this for MMX psub." with
a somewhat randomly chosen cpu that will minimize cpu specific
differences on bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181814 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-14 18:33:40 +00:00
Eric Christopher
edf0dda528 Temporarily revert "Subtract isn't commutative, fix this for MMX psub."
It's causing failures on the atom bot.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181812 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-14 18:20:42 +00:00
Eric Christopher
304d73c9ee Subtract isn't commutative, fix this for MMX psub.
Patch by Andrea DiBiagio.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181809 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-14 17:52:05 +00:00
Jakob Stoklund Olesen
bc3db03bf0 Recognize sparc64 as an alias for sparcv9 triples.
Patch by Brad Smith!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181808 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-14 17:47:27 +00:00
Jyotsna Verma
36e1b51438 Hexagon: Add patterns to generate 'combine' instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181805 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-14 17:16:38 +00:00
Jyotsna Verma
91eadc6d69 Hexagon: ArePredicatesComplement should not restrict itself to TFRs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181803 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-14 16:36:34 +00:00
Derek Schuff
ed788b6283 Fix ARM FastISel tests, as a first step to enabling ARM FastISel
ARM FastISel is currently only enabled for iOS non-Thumb1, and I'm working on
enabling it for other targets. As a first step I've fixed some of the tests.
Changes to ARM FastISel tests:
- Different triples don't generate the same relocations (especially
  movw/movt versus constant pool loads). Use a regex to allow either.
- Mangling is different. Use a regex to allow either.
- The reserved registers are sometimes different, so registers get
  allocated in a different order. Capture the names only where this
  occurs.
- Add -verify-machineinstrs to some tests where it works. It doesn't
  work everywhere it should yet.
- Add -fast-isel-abort to many tests that didn't have it before.
- Split out the VarArg test from fast-isel-call.ll into its own
  test. This simplifies test setup because of --check-prefix.

Patch by JF Bastien

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181801 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-14 16:26:38 +00:00
Bill Schmidt
ded53bf4dd PPC32: Fix stack collision between FP and CR save areas.
The changes to CR spill handling missed a case for 32-bit PowerPC.
The code in PPCFrameLowering::processFunctionBeforeFrameFinalized()
checks whether CR spill has occurred using a flag in the function
info.  This flag is only set by storeRegToStackSlot and
loadRegFromStackSlot.  spillCalleeSavedRegisters does not call
storeRegToStackSlot, but instead produces MI directly.  Thus we don't
see the CR is spilled when assigning frame offsets, and the CR spill
ends up colliding with some other location (generally the FP slot).

This patch sets the flag in spillCalleeSavedRegisters for PPC32 so
that the CR spill is properly detected and gets its own slot in the
stack frame.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181800 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-14 16:08:32 +00:00
Jyotsna Verma
21e6ea54ff Hexagon: Test case to check if branch probabilities are properly reflected in
the jump instructions in the form of taken/not-taken hint.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181799 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-14 15:50:49 +00:00
Michel Danzer
5096dc74ae R600/SI: Add lit test coverage for the remaining patterns added recently
Reviewed-by: Christian König <christian.koenig@amd.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181775 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-14 09:53:30 +00:00
Reed Kotler
eafa96485a This is the first of three patches which creates stubs used for
Mips16/32 floating point interoperability.

When Mips16 code calls external functions that would normally have some
of its parameters or return values passed in floating point registers,
it needs (Mips32) helper functions to do this because while in Mips16 mode
there is no ability to access the floating point registers.

In Pic mode, this is done with a set of predefined functions in libc.
This case is already handled in llvm for Mips16.

In static relocation mode, for efficiency reasons, the compiler generates
stubs that the linker will use if it turns out that the external function
is a Mips32 function. (If it's Mips16, then it does not need the helper
stubs).

These stubs are identically named and the linker knows about these tricks
and will not create multiple copies and will delete them if they are not
needed.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181753 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-14 02:00:24 +00:00
Akira Hatanaka
dd29df06fa StackColoring: don't clear an instruction's mem operand if the underlying
object is a PseudoSourceValue and PseudoSourceValue::isConstant returns true (i.e.,
points to memory that has a constant value).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181751 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-14 01:42:44 +00:00
Bill Schmidt
240b9b6078 PPC64: Constant initializers with dynamic relocations go in .data.rel.ro.
This fixes warning messages observed in the oggenc application test in
projects/test-suite.  Special handling is needed for the 64-bit
PowerPC SVR4 ABI when a constant is initialized with a pointer to a
function in a shared library.  Because a function address is
implemented as the address of a function descriptor, the use of copy
relocations can lead to problems with initialization.  GNU ld
therefore replaces copy relocations with dynamic relocations to be
resolved by the dynamic linker.  This means the constant cannot reside
in the read-only data section, but instead belongs in .data.rel.ro,
which is designed for constants containing dynamic relocations.

The implementation creates a class PPC64LinuxTargetObjectFile
inheriting from TargetLoweringObjectFileELF, which behaves like its
parent except to place constants of this sort into .data.rel.ro.

The test case is reduced from the oggenc application.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181723 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-13 19:34:37 +00:00
Akira Hatanaka
42f562a169 [mips] Add option -mno-ldc1-sdc1.
This option is used when the user wants to avoid emitting double precision FP
loads and stores. Double precision FP loads and stores are expanded to single
precision instructions after register allocation.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181718 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-13 18:23:35 +00:00
Lang Hames
d26c93d3a8 Correctly preserve the input chain for potential tailcall nodes whose
return values are bitcasts.

The chain had previously been being clobbered with the entry node to
the dag, which sometimes caused other code in the function to be
erroneously deleted when tailcall optimization kicked in.

<rdar://problem/13827621>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181696 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-13 10:21:19 +00:00
Hao Liu
3778c04b2e Fix PR15950 A bug in DAG Combiner about undef mask
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181682 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-13 02:07:05 +00:00
Reed Kotler
a6e31cf69e Add -mtriple=mipsel-linux-gnu to the test so that the compiler does
not think it can support small data sections.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181654 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-11 01:02:20 +00:00
Reed Kotler
46090914b7 Checkin in of first of several patches to finish implementation of
mips16/mips32 floating point interoperability. 

This patch fixes returns from mips16 functions so that if the function
was in fact called by a mips32 hard float routine, then values
that would have been returned in floating point registers are so returned.

Mips16 mode has no floating point instructions so there is no way to
load values into floating point registers.

This is needed when returning float, double, single complex, double complex
in the Mips ABI.

Helper functions in libc for mips16 are available to do this.

For efficiency purposes, these helper functions have a different calling
convention from normal Mips calls.

Registers v0,v1,a0,a1 are used to pass parameters instead of
a0,a1,a2,a3.

This is because v0,v1,a0,a1 are the natural registers used to return
floating point values in soft float. These values can then be moved
to the appropriate floating point registers with no extra cost.

The only register that is modified is ra in this call.

The helper functions make sure that the return values are in the floating
point registers that they would be in if soft float was not in effect
(which it is for mips16, though the soft float is implemented using a mips32
library that uses hard float).
 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181641 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-10 22:25:39 +00:00