61632 Commits

Author SHA1 Message Date
Hal Finkel
e50c8c1f81 Add a PPCCTRLoops verification pass
When asserts are enabled, this adds a verification pass for PPC counter-loop
formation. Unfortunately, without sacrificing code quality, there is no better
way of forming counter-based loops except at the (late) IR level. This means
that we need to recognize, at the IR level, anything which might turn into a
function call (or indirect branch). Because this is currently a finite set of
things, and because SelectionDAG lowering is basic-block local, this can be
done. Nevertheless, it is fragile, and failure results in a miscompile. This
verification pass checks that all (reachable) counter-based branches are
dominated by a loop mtctr instruction, and that no instructions in between
clobber the counter register. If these conditions are not satisfied, then an
ICE will be triggered.

In short, this is to help us sleep better at night.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182295 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 16:08:17 +00:00
Benjamin Kramer
d54bf7ceef R600: Fix bug detected by GCC warning.
R600TextureIntrinsicsReplacer.cpp:232: warning: the address of ‘ArgsType’ will always evaluate as ‘true’

This doesn't have any effect on the output as a vararg intrinsic behaves the
same way as a non-vararg one.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182293 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 15:58:43 +00:00
Tom Stellard
5ad6d082fc R600/SI: Use a multiclass for MUBUF_Load_Helper
This will simplify the instructions and also the pattern definitions.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182288 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 15:02:31 +00:00
Tom Stellard
307a84425c R600/SI: Add a pattern for S_LOAD_DWORDX2_* instructions
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182287 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 15:02:28 +00:00
Tom Stellard
0bbfc9313c R600/SI: Add pattern for rotr
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182286 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 15:02:24 +00:00
Tom Stellard
ba534c2143 R600: Swap the legality of rotl and rotr
The hardware supports rotr and not rotl.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182285 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 15:02:19 +00:00
Tom Stellard
a9d5d0b346 R600/SI: Add patterns for 64-bit shift operations
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182284 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 15:02:12 +00:00
Tom Stellard
fbdd318120 R600/SI: Use the same names for VOP3 operands and encoding fields
This makes it possible to reorder the operands without breaking the
encoding.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182283 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 15:02:08 +00:00
Tom Stellard
9bf4590aaa R600/SI: Make fitsRegClass() operands const
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182282 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 15:02:01 +00:00
Mihai Popa
30a7a7c1fd VSTn instructions have a number of encoding constraints which are not implemented. I have added these using wrapper methods around the original custom decoder (incidentally - this is a huge poorly written method that should be cleaned up. I have left it as is since the changes would be much to hard to review).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182281 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 14:57:05 +00:00
Mihai Popa
bac932e9c3 Q registers are encoded in fields of the same length as D registers. As Q registers are half as many, the ARM reference manual mandates the least significant bit to be zeroed out. Failure to do so should result in an undefined instruction. With this change test/MC/Disassembler/ARM/invalid-VQADD-arm.txt is passing (removed XFAIL).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182279 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 14:42:43 +00:00
Richard Sandiford
44b486ed78 [SystemZ] Add long branch pass
Before this change, the SystemZ backend would use BRCL for all branches
and only consider shortening them to BRC when generating an object file.
E.g. a branch on equal would use the JGE alias of BRCL in assembly output,
but might be shortened to the JE alias of BRC in ELF output.  This was
a useful first step, but it had two problems:

(1) The z assembler isn't traditionally supposed to perform branch shortening
    or branch relaxation.  We followed this rule by not relaxing branches
    in assembler input, but that meant that generating assembly code and
    then assembling it would not produce the same result as going directly
    to object code; the former would give long branches everywhere, whereas
    the latter would use short branches where possible.

(2) Other useful branches, like COMPARE AND BRANCH, do not have long forms.
    We would need to do something else before supporting them.

    (Although COMPARE AND BRANCH does not change the condition codes,
    the plan is to model COMPARE AND BRANCH as a CC-clobbering instruction
    during codegen, so that we can safely lower it to a separate compare
    and long branch where necessary.  This is not a valid transformation
    for the assembler proper to make.)

This patch therefore moves branch relaxation to a pre-emit pass.
For now, calls are still shortened from BRASL to BRAS by the assembler,
although this too is not really the traditional behaviour.

The first test takes about 1.5s to run, and there are likely to be
more tests in this vein once further branch types are added.  The feeling
on IRC was that 1.5s is a bit much for a single test, so I've restricted
it to SystemZ hosts for now.

The patch exposes (and fixes) some typos in the main CodeGen/SystemZ tests.
A later patch will remove the {{g}}s from that directory.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182274 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 14:23:08 +00:00
Justin Holewinski
7536ecf291 [NVPTX] Add GenericToNVVM IR converter to better handle idiomatic LLVM IR inputs
This converter currently only handles global variables in address space 0. For
these variables, they are promoted to address space 1 (global memory), and all
uses are updated to point to the result of a cvta.global instruction on the new
variable.

The motivation for this is address space 0 global variables are illegal since we
cannot declare variables in the generic address space.  Instead, we place the
variables in address space 1 and explicitly convert the pointer to address
space 0. This is primarily intended to help new users who expect to be able to
place global variables in the default address space.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182254 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 12:13:32 +00:00
Justin Holewinski
55fdf53629 [NVPTX] Fix i1 kernel parameters and global variables. ABI rules say we need to use .u8 for i1 parameters for kernels.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182253 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 12:13:28 +00:00
Stepan Dyatkovskiy
083bc97344 PR15868 fix.
Introduction:
In case when stack alignment is 8 and GPRs parameter part size is not N*8:
we add padding to GPRs part, so part's last byte must be recovered at
address K*8-1.
We need to do it, since remained (stack) part of parameter starts from
address K*8, and we need to "attach" "GPRs head" without gaps to it:

Stack:
|---- 8 bytes block ----| |---- 8 bytes block ----| |---- 8 bytes...
[ [padding] [GPRs head] ] [ ------ Tail passed via stack  ------ ...

FIX:
Note, once we added padding we need to correct *all* Arg offsets that are going
after padded one. That's why we need this fix: Arg offsets were never corrected
before this patch. See new test-cases included in patch.

We also don't need to insert padding for byval parameters that are stored in GPRs
only. We need pad only last byval parameter and only in case it outsides GPRs
and stack alignment = 8.
Though, stack area, allocated for recovered byval params, must satisfy
"Size mod 8 = 0" restriction.

This patch reduces stack usage for some cases:
We can reduce ArgRegsSaveArea since inner N*4 bytes sized byval params my be
"packed" with alignment 4 in some cases.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182237 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 08:01:34 +00:00
Jakob Stoklund Olesen
89f530ebbf Also expand 64-bit bitcasts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182229 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 01:01:43 +00:00
Jakob Stoklund Olesen
5e5b78ca36 Implement spill and fill of I64Regs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182228 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 00:53:25 +00:00
Jakob Stoklund Olesen
900622e099 Mark i64 SETCC as expand so it is turned into a SELECT_CC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182227 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-20 00:28:36 +00:00
Benjamin Kramer
4dc8bdf87d Replace some bit operations with simpler ones. No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182226 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-19 22:01:57 +00:00
Jakob Stoklund Olesen
634123e98d Don't use %g0 to materialize 0 directly.
The wired physreg doesn't work on tied operands like on MOVXCC.

Add a README note to fix this later.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182225 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-19 21:47:13 +00:00
Jakob Stoklund Olesen
60abcb786e Select i64 values with %icc conditions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182224 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-19 20:38:21 +00:00
Bob Wilson
2ed2ad00f9 Remove declaration of __clear_cache for __APPLE__. <rdar://problem/13924072>
This fixes a bootstrapping problem with builds for Apple ARM targets.
Clang had the wrong prototype for __clear_cache with ARM targets.  Rafael
fixed that in clang svn r181784 and r181810, but without those changes,
we can't build this code for ARM because clang reports an error about the
declaration in Memory.inc not matching the builtin declaration. Some of our
buildbots need to use an older compiler that doesn't have the clang fix.
Since __clear_cache is never used here when __APPLE__ is defined, I'm just
conditionalizing the declaration to match that. I also moved the declaration
of sys_icache_invalidate inside the conditional for __APPLE__ while I was at
it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182223 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-19 20:33:51 +00:00
Jakob Stoklund Olesen
51d46c36bc Add floating point selects on %xcc predicates.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182222 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-19 20:33:11 +00:00
Jakob Stoklund Olesen
89db6732fb Implement SPselectfcc for i64 operands.
Also clean up the arguments to all the MOVCC instructions so the
operands always are (true-val, false-val, cond-code).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182221 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-19 20:20:54 +00:00
Venkatraman Govindaraju
21886a495a [Sparc] Rearrange integer registers' allocation order so that register allocator will use I and G registers before using L and O registers.
Also, enable registers %g2-%g4 to be used in application and %g5 in 64 bit mode.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182219 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-19 20:07:20 +00:00
Jakob Stoklund Olesen
00ce0f6512 Handle i64 FrameIndex nodes in SPARC v9 mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182216 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-19 19:14:24 +00:00
Tim Northover
675b9e9f3d AArch64: make RuntimeDyld relocations idempotent
AArch64 ELF uses .rela relocations so there's no need to actually make
use of the bits we're setting in the destination  However, we should
make sure all bits are cleared properly since multiple runs of
resolveRelocations are possible and these could combine to produce
invalid results if stale versions remain in the code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182214 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-19 15:39:03 +00:00
Tim Northover
820b147493 Invalidate instruction cache when setting memory to be executable.
lli's remote MCJIT code calls setExecutable just prior to running
code. In line with Darwin behaviour this seems to be the place to
invalidate any caches needed so that relocations can take effect
properly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182213 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-19 15:28:16 +00:00
David Majnemer
cb9d4667b7 isKnownToBeAPowerOfTwo: (X & Y) + Y is a power of 2 or zero if y is also.
This is useful if something that looks like (x & (1 << y)) ? 64 : 32 is
the divisor in a modulo operation.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182200 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-18 19:30:37 +00:00
Arnold Schwaighofer
688b5103eb LoopVectorize: Handle single edge PHIs
We might encouter single edge PHIs - handle them with an identity select.

Fixes PR15990.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182199 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-18 18:38:34 +00:00
Hal Finkel
bf0bc3b2a2 Check InlineAsm clobbers in PPCCTRLoops
We don't need to reject all inline asm as using the counter register (most does
not). Only those that explicitly clobber the counter register need to prevent
the transformation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182191 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-18 09:20:39 +00:00
Tim Northover
9f61e485e6 AArch64: add CMake dependency to fix very parallel builds
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182190 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-18 08:17:47 +00:00
David Majnemer
8a55c2ecd4 X86: Bad peephole interaction between adc, MOV32r0
The peephole tries to reorder MOV32r0 instructions such that they are
before the instruction that modifies EFLAGS.

The problem is that the peephole does not consider the case where the
instruction that modifies EFLAGS also depends on the previous state of
EFLAGS.

Instead, walk backwards until we find an instruction that has a def for
EFLAGS but does not have a use.
If we find such an instruction, insert the MOV32r0 before it.
If it cannot find such an instruction, skip the optimization.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182184 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-18 01:02:03 +00:00
Matt Arsenault
24623dcd24 Remove duplicated comment
The same comment is already made in the header

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182181 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-18 00:24:09 +00:00
Matt Arsenault
225ed7069c Add LLVMContext argument to getSetCCResultType
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182180 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-18 00:21:46 +00:00
JF Bastien
bab06ba696 Support unaligned load/store on more ARM targets
This patch matches GCC behavior: the code used to only allow unaligned
load/store on ARM for v6+ Darwin, it will now allow unaligned load/store
for v6+ Darwin as well as for v7+ on Linux and NaCl.

The distinction is made because v6 doesn't guarantee support (but LLVM
assumes that Apple controls hardware+kernel and therefore have
conformant v6 CPUs), whereas v7 does provide this guarantee (and
Linux/NaCl behave sanely).

The patch keeps the -arm-strict-align command line option, and adds
-arm-no-strict-align. They behave similarly to GCC's -mstrict-align and
-mnostrict-align.

I originally encountered this discrepancy in FastIsel tests which expect
unaligned load/store generation. Overall this should slightly improve
performance in most cases because of reduced I$ pressure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182175 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-17 23:49:01 +00:00
Rafael Espindola
2bbe378147 Convert obj2yaml to use yamlio.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182169 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-17 22:58:42 +00:00
Rafael Espindola
eee6cdd781 Fix the build in c++11 mode.
The errors were:

non-constant-expression cannot be narrowed from type 'int64_t' (aka 'long') to 'uint32_t' (aka 'unsigned int') in initializer list

and

non-constant-expression cannot be narrowed from type 'long' to 'uint32_t' (aka 'unsigned int') in initializer list

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182168 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-17 22:45:52 +00:00
Matt Arsenault
9aa8fdfddb Replace redundant code
Use EVT::changeExtendedVectorElementTypeToInteger instead of doing the
same thing that it does

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182165 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-17 21:43:43 +00:00
Matt Arsenault
63f3ca5da7 Add missing -*- C++ -*- to headers
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182164 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-17 21:43:39 +00:00
Vincent Lejeune
df98ad3959 R600: Lower int_load_input to copyFromReg instead of Register node
It solves a bug uncovered by dot4 patch where the register class of
int_load_input use was ignored.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182130 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-17 16:51:06 +00:00
Vincent Lejeune
76fc2d077f R600: Use bottom up scheduling algorithm
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182129 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-17 16:50:56 +00:00
Vincent Lejeune
21ca0b3ea4 R600: Use depth first scheduling algorithm
It should increase PV substitution opportunities and lower gpr
usage (pending computations path are "flushed" sooner)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182128 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-17 16:50:44 +00:00
Vincent Lejeune
f63f85affa R600: Replace big texture opcode switch in scheduler by usesTC/usesVC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182127 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-17 16:50:37 +00:00
Vincent Lejeune
4ed9917147 R600: Relax some vector constraints on Dot4.
Dot4 now uses 8 scalar operands instead of 2 vectors one which allows register
coalescer to remove some unneeded COPY.
This patch also defines some structures/functions that can be used to handle
every vector instructions (CUBE, Cayman special instructions...) in a similar
fashion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182126 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-17 16:50:32 +00:00
Vincent Lejeune
d3293b49f9 R600: Improve texture handling
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182125 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-17 16:50:20 +00:00
Vincent Lejeune
4109bd8829 R600: Rename 128 bit registers.
Almost all instructions that takes a 128 bits reg as input (fetch, export...)
have the abilities to swizzle their argument and output. Instead of printing
default swizzle for each 128 bits reg, rename T*.XYZW to T* and let instructions
print potentially optimized swizzles themselves.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182124 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-17 16:50:09 +00:00
Vincent Lejeune
25c209e9a2 R600: Some factorization
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182123 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-17 16:50:02 +00:00
Vincent Lejeune
dcfcf1d1ff R600: Factorize Fetch size limit inside AMDGPUSubTarget
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182122 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-17 16:49:55 +00:00
Vincent Lejeune
9a9e936650 R600: prettier dump of clamp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182121 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-17 16:49:49 +00:00