llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-08-25 00:29:20 +00:00

Author	SHA1	Message	Date
Duncan P. N. Exon Smith	13f5c5896d	verify-uselistorder: Force -preserve-bc-use-list-order git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216022 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-19 21:08:27 +00:00
Juergen Ributzka	1d58f989d4	[FastISel][AArch64] Extend floating-point materialization test. This adds the missing test that I promised for r215753 to test the materialization of the floating-point value +0.0. Related to <rdar://problem/18027157>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216019 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-19 20:35:07 +00:00
Juergen Ributzka	06bb1ca1e0	Reapply [FastISel][AArch64] Add support for more addressing modes (r215597). Note: This was originally reverted to track down a buildbot error. Reapply without any modifications. Original commit message: FastISel didn't take much advantage of the different addressing modes available to it on AArch64. This commit allows the ComputeAddress method to recognize more addressing modes that allows shifts and sign-/zero-extensions to be folded into the memory operation itself. For Example: lsl x1, x1, #3 --> ldr x0, [x0, x1, lsl #3] ldr x0, [x0, x1] sxtw x1, w1 lsl x1, x1, #3 --> ldr x0, [x0, x1, sxtw #3] ldr x0, [x0, x1] git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216013 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-19 19:44:17 +00:00
Juergen Ributzka	96b1e70c66	Reapply [FastISel][X86] Add large code model support for materializing floating-point constants (r215595). Note: This was originally reverted to track down a buildbot error. Reapply without any modifications. Original commit message: In the large code model for X86 floating-point constants are placed in the constant pool and materialized by loading from it. Since the constant pool could be far away, a PC relative load might not work. Therefore we first materialize the address of the constant pool with a movabsq and then load from there the floating-point value. Fixes <rdar://problem/17674628>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216012 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-19 19:44:13 +00:00
Juergen Ributzka	9c23685dd2	Reapply [FastISel][X86] Use XOR to materialize the "0" value (r215594). Note: This was originally reverted to track down a buildbot error. Reapply without any modifications. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216011 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-19 19:44:10 +00:00
Juergen Ributzka	e8757c5dbb	Reapply [FastISel][X86] Emit more efficient instructions for integer constant materialization (r215593). Note: This was originally reverted to track down a buildbot error. Reapply without any modifications. Original commit message: This mostly affects the i64 value type, which always resulted in an 15byte mobavsq instruction to materialize any constant. The custom code checks the value of the immediate and tries to use a different and smaller mov instruction when possible. This fixes <rdar://problem/17420988>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216010 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-19 19:44:06 +00:00
Juergen Ributzka	78f686d37c	Reapply [FastISel][AArch64] Make use of the zero register when possible (r215591). Note: This was originally reverted to track down a buildbot error. Reapply without any modifications. Original commit message: This change materializes now the value "0" from the zero register. The zero register can be folded by several instruction, so no materialization is need at all. Fixes <rdar://problem/17924413>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216009 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-19 19:44:02 +00:00
Juergen Ributzka	f08cddcf56	Reapply [FastISel] Let the target decide first if it wants to materialize a constant (215588). Note: This was originally reverted to track down a buildbot error. This commit exposed a latent bug that was fixed in r215753. Therefore it is reapplied without any modifications. I run it through SPEC2k and SPEC2k6 for AArch64 and it didn't introduce any new regeressions. Original commit message: This changes the order in which FastISel tries to materialize a constant. Originally it would try to use a simple target-independent approach, which can lead to the generation of inefficient code. On X86 this would result in the use of movabsq to materialize any 64bit integer constant - even for simple and small values such as 0 and 1. Also some very funny floating-point materialization could be observed too. On AArch64 it would materialize the constant 0 in a register even the architecture has an actual "zero" register. On ARM it would generate unnecessary mov instructions or not use mvn. This change simply changes the order and always asks the target first if it likes to materialize the constant. This doesn't fix all the issues mentioned above, but it enables the targets to implement such optimizations. Related to <rdar://problem/17420988>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216006 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-19 19:05:24 +00:00
Renato Golin	8308f0e30f	Revert "Small refactor on VectorizerHint for deduplication" This reverts commit r215994 because MSVC 2012 can't cope with its C++11 goodness. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215999 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-19 18:08:50 +00:00
Juergen Ributzka	8841fb5f25	[FastISel][AArch64] Fix a few BuildMI callsites where the result register was added as an operand register. This fixes a few BuildMI callsites where the result register was added by using addReg, which is per default a use and therefore an operand register. Also use the zero register as result register when emitting a compare instruction (SUBS with unused result register). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215997 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-19 17:41:53 +00:00
Renato Golin	dca126522d	Small refactor on VectorizerHint for deduplication Previously, the hint mechanism relied on clean up passes to remove redundant metadata, which still showed up if running opt at low levels of optimization. That also has shown that multiple nodes of the same type, but with different values could still coexist, even if temporary, and cause confusion if the next pass got the wrong value. This patch makes sure that, if metadata already exists in a loop, the hint mechanism will never append a new node, but always replace the existing one. It also enhances the algorithm to cope with more metadata types in the future by just adding a new type, not a lot of code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215994 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-19 17:30:43 +00:00
Toma Tabacu	109447ff1b	[mips] Add assembler support for .set arch=x directive. Summary: This directive is similar to ".set mipsX". It is used to change the CPU target of the assembler, enabling it to accept instructions for a specific CPU. This patch only implements the r4000 CPU (which is treated internally as generic mips3) and the generic ISAs. Contains work done by Matheus Almeida. Reviewers: dsanders Reviewed By: dsanders Differential Revision: http://reviews.llvm.org/D4884 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215978 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-19 14:22:52 +00:00
Mayur Pandey	ecdb0ab90f	InstCombine: ((A & ~B) ^ (~A & B)) to A ^ B Proof using CVC3 follows: $ cat t.cvc A, B : BITVECTOR(32); QUERY BVXOR((A & ~B),(~A & B)) = BVXOR(A,B); $ cvc3 t.cvc Valid. Differential Revision: http://reviews.llvm.org/D4898 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215974 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-19 08:19:19 +00:00
Akira Hatanaka	6290308366	[X86, X87 stackifier] Do not mark an operand of a debug instruction as kill. <rdar://problem/16952634> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215962 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-19 02:09:57 +00:00
Duncan P. N. Exon Smith	36c150801a	verify-uselistorder: Call verifyModule() and improve output Call `verifyModule()` after parsing and after every transformation. Also convert some `DEBUG(dbgs())` to `errs()` to increase visibility into what's going on. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215951 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-18 23:44:14 +00:00
Robin Morisset	0acd42142a	Answer to Philip Reames comments - add check for volatile (probably unneeded, but I agree that we should be conservative about it). - strengthen condition from isUnordered() to isSimple(), as I don't understand well enough Unordered semantics (and it also matches the comment better this way) to be confident in the previous behaviour (thanks for catching that one, I had missed the case Monotonic/Unordered). - separate a condition in two. - lengthen comment about aliasing and loads - add tests in GVN/atomic.ll git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215943 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-18 22:18:14 +00:00
Robin Morisset	6c0e1e0fa6	Weak relaxing of the constraints on atomics in MemoryDependencyAnalysis Monotonic accesses do not have to kill the analysis, as long as the QueryInstr is not itself atomic. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215942 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-18 22:18:11 +00:00
Kevin Enderby	f759032ccd	Make llvm-objdump handle both arm and thumb disassembly from the same Mach-O file with -macho, the Mach-O specific object file parser option. After some discussion I chose to do this implementation contained in the logic of llvm-objdump’s MachODump.cpp using a second disassembler for thumb when needed and with updates mostly contained in the MachOObjectFile class. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215931 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-18 20:21:02 +00:00
Oliver Stannard	eb922109f9	Teach the AArch64 backend to handle f16 This allows the AArch64 backend to handle fadd, fsub, fmul and fdiv operations on f16 (half-precision) types by promoting to f32. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215891 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-18 14:22:39 +00:00
Oliver Stannard	802d420792	[ARM,AArch64] Do not tail-call to an externally-defined function with weak linkage Externally-defined functions with weak linkage should not be tail-called on ARM or AArch64, as the AAELF spec requires normal calls to undefined weak functions to be replaced with a NOP or jump to the next instruction. The behaviour of branch instructions in this situation (as used for tail calls) is implementation-defined, so we cannot rely on the linker replacing the tail call with a return. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215890 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-18 12:42:15 +00:00
Elena Demikhovsky	9735ccb7ea	AVX-512: Fixed a bug in emitting compare for MVT:i1 type. Added a test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215889 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-18 11:59:06 +00:00
Saleem Abdulrasool	f15492fd72	ARM: improve RTABI 4.2 conformance on Linux The set of functions defined in the RTABI was separated for no real reason. This brings us closer to proper utilisation of the functions defined by the RTABI. It also sets the ground for correctly emitting function calls to AEABI functions on all AEABI conforming platforms. The previously existing lie on the behaviour of __ldivmod and __uldivmod is propagated as it is beyond the scope of the change. The changes to the test are due to the fact that we now use the divmod functions which return both the quotient and remainder and thus we no longer need to invoke two functions on Linux (making it closer to EABI's behaviour). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215862 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-17 22:51:02 +00:00
Daniel Sanders	5535fca8bf	Revert: r215698 - Current implementation of c.cond.fmt instructions only accept default cc0 register... It causes a number of regressions when -fintegrated-as is enabled. This happens because there are codegen-only instructions that incorrectly uses the first operand as the encoding for the $fcc register. The regressions do not occur when -via-file-asm is also given. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215847 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-17 19:47:47 +00:00
Saleem Abdulrasool	70d641fbec	ARM: correct toggling behaviour This was a thinko. The intent was to flip the explicit bits that need toggling rather than all bits. This would result in incorrect behaviour (which now is tested). Thanks to Nico Weber for pointing this out! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215846 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-17 19:20:38 +00:00
Rafael Espindola	78e8d52a58	llvm-objdump: don't print relocations in non-relocatable files. This matches the behavior of GNU objdump. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215844 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-17 19:09:37 +00:00
Rafael Espindola	4c511cd88f	Fix an off-by-one bug in the target independent llvm-objdump. It would prevent the display of a single byte instruction before a label. Patch by Steve King! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215837 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-17 16:31:39 +00:00
Owen Anderson	7a0201c6a6	Remove an InstCombine that transformed patterns like (x * uitofp i1 y) to (select y, x, 0.0) when the multiply has fast math flags set. While this might seem like an obvious canonicalization, there is one subtle problem with it. The result of the original expression is undef when x is NaN (remember, fast math flags), but the result of the select is always defined when x is NaN. This means that the new expression is strictly more defined than the original one. One unfortunate consequence of this is that the transform is not reversible! It's always legal to make increase the defined-ness of an expression, but it's not legal to reduce it. Thus, targets that prefer the original form of the expression cannot reverse the transform to recover it. Another way to think of it is that the transform has lost source-level information (the fast math flags), which is undesirable. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215825 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-17 03:51:29 +00:00
NAKAMURA Takumi	0ac5626b56	llvm/test/CodeGen/X86/fmul-combines.ll: Appease Windows x64. <4 x float> is passed by stack. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215821 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-16 22:28:37 +00:00
Matt Arsenault	5f8a9ae17c	Fix fmul combines with constant splat vectors Fixes things like fmul x, 2 -> fadd x, x git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215820 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-16 10:14:19 +00:00
Chandler Carruth	a3805f1c73	[x86] Teach lots of the new vector shuffle lowering to use UNPCK instructions for blend operations at 128 bits. This was a serious hole in our prior blend lowering. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215819 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-16 09:42:15 +00:00
David Majnemer	cb698b26a1	InstCombine: Combine mul with div. We can combne a mul with a div if one of the operands is a multiple of the other: %mul = mul nsw nuw %a, C1 %ret = udiv %mul, C2 => %ret = mul nsw %a, (C1 / C2) This can expose further optimization opportunities if we end up multiplying or dividing by a power of 2. Consider this small example: define i32 @f(i32 %a) { %mul = mul nuw i32 %a, 14 %div = udiv exact i32 %mul, 7 ret i32 %div } which gets CodeGen'd to: imull $14, %edi, %eax imulq $613566757, %rax, %rcx shrq $32, %rcx subl %ecx, %eax shrl %eax addl %ecx, %eax shrl $2, %eax retq We can now transform this into: define i32 @f(i32 %a) { %shl = shl nuw i32 %a, 1 ret i32 %shl } which gets CodeGen'd to: leal (%rdi,%rdi), %eax retq This fixes PR20681. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215815 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-16 08:55:06 +00:00
Nico Weber	f1aba61bb5	arm asm: Let .fpu enable instructions, PR20447. I'm not very happy with duplicating the fpu->feature mapping in ARMAsmParser.cpp and in clang's driver. See the bug for a patch that doesn't do that, and the review thread [1] for why this duplication exists. 1: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20140811/231052.html git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215811 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-16 05:37:51 +00:00
Duncan P. N. Exon Smith	1d8c9d95bf	BitcodeReader: Only create one basic block for each blockaddress Block address forward-references are implemented by creating a `BasicBlock` ahead of time that gets inserted in the `Function` when it's eventually encountered. However, if the same blockaddress was used in two separate functions that were parsed before the referenced function (and the blockaddress was never used at global scope), two separate basic blocks would get created, one of which would be forgotten creating invalid IR. This commit changes the forward-reference logic to create only one basic block (and always return the same blockaddress). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215805 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-16 01:54:37 +00:00
Duncan P. N. Exon Smith	7a5cb43115	IR: Don't add inbounds to GEPs of extern_weak variables Global variables that have `extern_weak` linkage may be null, so it's incorrect to add `inbounds` when constant folding. This also fixes a bug when parsing global aliases, whose forward reference placeholders are global variables with `extern_weak` linkage. If GEPs to these aliases are encountered before the alias itself, the GEPs would incorrectly gain the `inbounds` keyword as well. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215803 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-16 01:54:32 +00:00
Andrea Di Biagio	89cea3c36b	[DAGCombiner] Improve the folding of target independet shuffles to Undef. When combining a pair of shuffle nodes, check if the combined shuffle mask is trivially Undef. In case, immediately fold that pair of shuffles to Undef. The lack of checks for undef masks was the root-cause of a poor-codegen bug in the dag combiner. Example: %1 = shufflevector <4 x i32> %A, <4 x i32> %B, <4 x i32> <i32 4, i32 1, i32 1, i32 6> %2 = shufflevector <4 x i32> %1, <4 x i32> undef, <4 x i32> <i32 0, i32 4, i32 1, i32 6> %3 = shufflevector <4 x i32> %2, <4 x i32> undef, <4 x i32> <i32 1, i32 5, i32 3, i32 3> Before this patch, on x86 (with -mcpu=corei7) we failed to fold the entire sequence to Undef value and therefore we generated: shufps $-123, %xmm1, $xmm0 pshufd $-46, %xmm0, %xmm0 With this patch, the entire shuffle sequence is folded to Undef and no shuffles are generated in the output assembly. Added new test cases to test 'combine-vec-shuffle-5.ll'. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215797 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-16 00:29:44 +00:00
Hal Finkel	5dc48ac04a	[PowerPC] Mark fixed-offset byvals as pointed-to by IR values A byval object, even if allocated at a fixed offset (prescribed by the ABI) is pointed to by IR values. Most fixed-offset stack objects are not pointed-to by IR values, so the default is to assume this is not possible. However, we need to override the default in this case (instruction scheduling can cause miscompiles otherwise). Fixes PR20280. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215795 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-16 00:17:05 +00:00
Chad Rosier	cc921d6f41	[AArch32] Add support for FP rounding operations for ARMv8/AArch32. Phabricator Revision: http://reviews.llvm.org/D4935 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215772 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-15 21:38:16 +00:00
Rafael Espindola	0549fc2448	Set comdats when lazily linking functions. We were setting the comdat when functions were copied in the initial pass, but not when they were linked only when we found out that they are needed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215765 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-15 20:17:08 +00:00
Matt Arsenault	c86e55eb6e	R600/SI: Move all fabs / fneg handling to patterns git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215749 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-15 18:42:22 +00:00
Matt Arsenault	0498d07255	R600/SI: Use source modifiers for f64 fneg git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215748 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-15 18:42:18 +00:00
Matt Arsenault	c882fc78fe	R600/SI: Use source modifier for f64 fabs git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215747 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-15 18:42:15 +00:00
Matt Arsenault	34ef4cd65b	R600/SI: Fix offset folding in some cases with shifted pointers. Ordinarily (shl (add x, c1), c2) -> (add (shl x, c2), c1 << c2) is only done if the add has one use. If the resulting constant add can be folded into an addressing mode, force this to happen for the pointer operand. This ends up happening a lot because of how LDS objects are allocated. Since the globals are allocated next to each other, acessing the first element of the second object is directly indexed by a shifted pointer. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215739 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-15 17:49:05 +00:00
Chandler Carruth	92ee945e2e	[x86] Teach the new AVX v4f64 shuffle lowering to use UNPCK instructions where applicable for blending. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215737 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-15 17:42:00 +00:00
Matt Arsenault	5bc44c7603	R600/SI: Add intrinsic for ldexp git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215734 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-15 17:30:25 +00:00
Juergen Ributzka	e2bb4f981b	[FastISel][ARM] Fix unit test from r215682. Thanks Jim for finding this. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215733 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-15 17:23:20 +00:00
Matt Arsenault	ed76ca720b	R600/SI: Implement isLegalAddressingMode The default assumes that a 16-bit signed offset is used. LDS instruction use a 16-bit unsigned offset, so it wasn't being used in some cases where it was assumed a negative offset could be used. More should be done here, but first isLegalAddressingMode needs to gain an addressing mode argument. For now, copy most of the rest of the default implementation with the immediate offset change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215732 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-15 17:17:07 +00:00
Moritz Roth	d84561bf69	ARM: Fix and re-enable load/store optimizer for Thumb1. In a previous iteration of the pass, we would try to compensate for writeback by updating later instructions and/or inserting a SUBS to reset the base register if necessary. Since such a SUBS sets the condition flags it's not generally safe to do this. For now, only merge LDR/STRs if there is no writeback to the base register (LDM that loads into the base register) or the base register is killed by one of the merged instructions. These cases are clear wins both in terms of instruction count and performance. Also add three new test cases, and update the existing ones accordingly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215729 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-15 17:00:30 +00:00
Amara Emerson	cef3ad6720	[AArch64] Narrow arguments passed in wrong position on the stack in big-endian mode. Patch by Asiri Rathnayake. Differential Revision: http://reviews.llvm.org/D4922 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215716 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-15 14:29:57 +00:00
Rafael Espindola	a348fc7fda	Remove HasLEB128. We already require CFI, so it should be safe to require .leb128 and .uleb128. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215712 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-15 14:01:07 +00:00
Bill Schmidt	44beebe8de	[PPC64] Add test case for r215685. I had deferred adding this test case until I could get it down to a reasonable size. That's done now. Thanks, Bill git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215711 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-15 13:51:57 +00:00
Chandler Carruth	12e69a0267	[x86] Add the initial skeleton of type-based dispatch for AVX vectors in the new shuffle lowering and an implementation for v4 shuffles. This allows us to handle non-half-crossing shuffles directly for v4 shuffles, both integer and floating point. This currently misses places where we could perform the blend via UNPCK instructions, but otherwise generates equally good or better code for the test cases included to the existing vector shuffle lowering. There are a few cases that are entertainingly better. ;] git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215702 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-15 11:01:40 +00:00
Tim Northover	f52efce72d	ARM: implement MRS/MSR (banked reg) system instructions. These are system-only instructions for CPUs with virtualization extensions, allowing a hypervisor easy access to all of the various different AArch32 registers. rdar://problem/17861345 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215700 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-15 10:47:12 +00:00
Vladimir Medic	30bb8f60e5	Current implementation of c.cond.fmt instructions only accept default cc0 register. This patch enables the instruction to accept other fcc registers. The aliases with default fcc0 registers are also defined. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215698 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-15 09:29:30 +00:00
Chandler Carruth	886f0101a7	[x86] Fix the very broken formation of vpunpck instructions in the target-specific shuffl DAG combines. We were recognizing the paired shuffles backwards. This code needs to be replaced anyways as we have the same functionality elsewhere, but I'll do the refactoring in a follow-up, this is the minimal fix to the behavior. In addition to fixing miscompiles with the new vector shuffle lowering, it also causes the canonicalization to kick in much better, selecting the smaller encoding variants in lots of places in the new AVX path. This still isn't quite ideal as we don't need both the shufpd and the punpck instructions, but that'll get fixed in a follow-up patch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215690 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-15 03:54:49 +00:00
Chandler Carruth	477f28c48d	[x86] Fix PR20540 where the x86 shuffle DAG combiner had completely broken logic for merging shuffle masks in the face of SM_SentinelZero mask operands. While these are '-1' they don't mean 'undef' the way '-1' means in the pre-legalized shuffle masks. Instead, they mean that the shuffle operation is forcibly zeroing that lane. Reflect this and explicitly handle it in a bunch of places. In one place the effect is equivalent but much more clear. In the rest it was really weirdly broken. Also, rewrite the entire merging thing to be a more directy operation with a single loop and just doing math to map the indices through the various masks. Also add a bunch of asserts to try to make in extremely clear what the different masks can possibly look like. Finally, add some comments to clarify that we're merging shuffle masks up here rather than down as we do everywhere else, and thus the logic is quite confusing. Thanks to several different people for sending test cases, and for Robert Khasanov for an initial attempt at fixing. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215687 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-15 02:43:18 +00:00
Juergen Ributzka	266ecacfaa	[FastISel][ARM] Fall-back to constant pool loads when materializing an i32 constant. FastEmit_i won't always succeed to materialize an i32 constant and just fail. This would trigger a fall-back to SelectionDAG, which is really not necessary. This fix will first fall-back to a constant pool load to materialize the constant before giving up for good. This fixes <rdar://problem/18022633>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215682 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-14 23:29:49 +00:00
Hal Finkel	e1e7862f6e	Copy noalias metadata from call sites to inlined instructions When a call site with noalias metadata is inlined, that metadata can be propagated directly to the inlined instructions (only those that might access memory because it is not useful on the others). Prior to inlining, the noalias metadata could express that a call would not alias with some other memory access, which implies that no instruction within that called function would alias. By propagating the metadata to the inlined instructions, we preserve that knowledge. This should complete the enhancements requested in PR20500. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215676 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-14 21:09:37 +00:00
Juergen Ributzka	6398a7f5fd	Revert several FastISel commits to track down a buildbot error. This reverts: r215595 "[FastISel][X86] Add large code model support for materializing floating-point constants." r215594 "[FastISel][X86] Use XOR to materialize the "0" value." r215593 "[FastISel][X86] Emit more efficient instructions for integer constant materialization." r215591 "[FastISel][AArch64] Make use of the zero register when possible." r215588 "[FastISel] Let the target decide first if it wants to materialize a constant." r215582 "[FastISel][AArch64] Cleanup constant materialization code. NFCI." git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215673 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-14 19:56:28 +00:00
Adam Nemet	41c5e687ed	[AVX512] Add test for FMA masking instrinsics git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215665 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-14 17:13:33 +00:00
Adam Nemet	90eb948fc9	[AVX512] Switch FMA intrinsics to the masking version This does the renaming and updates the lowering logic. Part of <rdar://problem/17688758> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215664 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-14 17:13:30 +00:00
Juergen Ributzka	14bc045838	Revert "[FastISel][AArch64] Add support for more addressing modes." This reverts commits r215597, because it might have broken the build bots. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215659 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-14 17:10:54 +00:00
Hal Finkel	b1b9953473	Add noalias metadata for general calls (not just memory intrinsics) during inlining When preserving noalias function parameter attributes by adding noalias metadata in the inliner, we should do this for general function calls (not just memory intrinsics). The logic is very similar to what already existed (except that we want to add this metadata even for functions taking no relevant parameters). This metadata can be used by ModRef queries in the caller after inlining. This addresses the first part of PR20500. Adding noalias metadata during inlining is still turned off by default. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215657 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-14 16:44:03 +00:00
Chad Rosier	3b41039163	[Reassociation] Add support for reassociation with unsafe algebra. Vector instructions are (still) not supported for either integer or floating point. Hopefully, that work will be landed shortly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215647 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-14 15:23:01 +00:00
Sanjay Patel	9615d702ad	optimize vector fneg of bitcasted integer value This patch allows a vector fneg of a bitcasted integer value to be optimized in the same way that we already optimize a scalar fneg. If the integer variable is a constant, we can precompute the result and not require any logic ops. This patch is very similar to a fabs patch committed at r214892. Differential Revision: http://reviews.llvm.org/D4852 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215646 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-14 15:15:28 +00:00
Rafael Espindola	1c9caced63	Delete support for AuroraUX. auroraux.org is not resolving. I will add this to the release notes as soon as I figure out where to put the 3.6 release notes :-) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215645 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-14 15:15:09 +00:00
Toma Tabacu	0b2081a05a	[mips] Improve robustness of some tests. Summary: This is done by removing some hardcoded registers like $at or expecting a single digit register to be selected. Contains work done by Matheus Almeida. Reviewers: matheusalmeida, dsanders Reviewed By: dsanders Subscribers: tomatabacu Differential Revision: http://reviews.llvm.org/D4227 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215640 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-14 13:10:48 +00:00
Chandler Carruth	cad1711154	[x86] Begin stubbing out the AVX support in the new vector shuffle lowering scheme. Currently, this just directly bails to the fallback path of splitting the 256-bit vector into two 128-bit vectors, operating there, and then joining the results back together. While the results are far from perfect, they are shockingly good for what we're doing here. I'll be layering the rest of the functionality on top of this piece by piece and updating tests as I go. Note that 256-bit vectors in this mode are still somewhat WIP. While I think the code paths that I'm adding here are clean and good-to-go, there are still a lot of 128-bit assumptions that I'll need to stomp out as I march through the functional spread here. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215637 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-14 12:13:59 +00:00
Toma Tabacu	cb43f81fc5	[mips] Add assembler support for the "la $reg,symbol" pseudo-instruction. Summary: This pseudo-instruction allows the programmer to load an address from a symbolic expression into a register. Patch by David Chisnall. His work was sponsored by: DARPA, AFRL I've made some minor changes to the original, such as improving the formatting and adding some comments, and I've also added a test case. Reviewers: dsanders Reviewed By: dsanders Differential Revision: http://reviews.llvm.org/D4808 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215630 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-14 10:29:17 +00:00
Chandler Carruth	369e0ef67d	[SDAG] Fix a bug in the DAG combiner where we would fail to return the input node after manually adding it to the worklist and using CombineTo. Once we use CombineTo the input node may have been deleted. Despite this being completely confusing and somewhat broken, the only way to "correctly" return from a DAG combine after potentially deleting the input node is to return that exact node.... But really, this code should just never have used CombineTo. It won't do what it wants (returning the node as mentioned above just causes the combine to infloop). The correct way to combine away a casted load to a load of the correct type is to RAUW the chain directly and then return the loaded value to replace the actual value node. I managed to find this with the vector shuffle fuzzer even though it clearly has nothing at all to do with vector shuffles and rather those happen to trigger a load of a constant pool that hits this combine just right. I've included the test as it is small and a nice stress test that the infrastructure isn't asserting. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215622 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-14 08:18:34 +00:00
David Majnemer	eb323b2b3c	InstCombine: ((A \| ~B) ^ (~A \| B)) to A ^ B Proof using CVC3 follows: $ cat t.cvc A, B : BITVECTOR(32); QUERY BVXOR((A \| ~B),(~A \|B)) = BVXOR(A,B); $ cvc3 t.cvc Valid. Patch by Mayur Pandey! Differential Revision: http://reviews.llvm.org/D4883 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215621 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-14 06:46:25 +00:00
David Majnemer	923556f8a8	Added InstCombine Transform for ((B \| C) & A) \| B -> B \| (A & C) Transform ((B \| C) & A) \| B --> B \| (A & C) Z3 Link: http://rise4fun.com/Z3/hP6p Patch by Sonam Kumari! Differential Revision: http://reviews.llvm.org/D4865 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215619 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-14 06:41:38 +00:00
Saleem Abdulrasool	0086358325	MC: AsmLexer: handle multi-character CommentStrings correctly As X86MCAsmInfoDarwin uses '##' as CommentString although a single '#' starts a comment a workaround for this special case is added. Fixes divisions in constant expressions for the AArch64 assembler and other targets which use '//' as CommentString. Patch by Janne Grunau! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215615 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-14 02:51:43 +00:00
Chandler Carruth	14ee003f1a	[SDAG] Fix a case where we would iteratively legalize a node during combining by replacing it with something else but not re-process the node afterward to remove it. In a truly remarkable stroke of bad luck, this would (in the test case attached) end up getting some other node combined into it without ever getting re-processed. By adding it back on to the worklist, in addition to deleting the dead nodes more quickly we also ensure that if it stops being dead for any reason it makes it back through the legalizer. Without this, the test case will end up failing during instruction selection due to an and node with a type we don't have an instruction pattern for. It took many million runs of the shuffle fuzz tester to find this. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215611 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-14 01:07:37 +00:00
Akira Hatanaka	d0ddfb0896	[AArch64, fast-isel] Fall back to SelectionDAG to select tail calls. Certain functions such as objc_autoreleaseReturnValue have to be called as tail-calls even at -O0. Since normal fast-isel doesn't emit calls as tail calls, we have to fall back to SelectionDAG to select calls that are marked as tail. <rdar://problem/17991614> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215600 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-13 23:23:58 +00:00
Juergen Ributzka	8c9a0319bb	[FastISel][AArch64] Add support for more addressing modes. FastISel didn't take much advantage of the different addressing modes available to it on AArch64. This commit allows the ComputeAddress method to recognize more addressing modes that allows shifts and sign-/zero-extensions to be folded into the memory operation itself. For Example: lsl x1, x1, #3 --> ldr x0, [x0, x1, lsl #3] ldr x0, [x0, x1] sxtw x1, w1 lsl x1, x1, #3 --> ldr x0, [x0, x1, sxtw #3] ldr x0, [x0, x1] git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215597 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-13 22:53:29 +00:00
Juergen Ributzka	b677a877c8	[FastISel][X86] Add large code model support for materializing floating-point constants. In the large code model for X86 floating-point constants are placed in the constant pool and materialized by loading from it. Since the constant pool could be far away, a PC relative load might not work. Therefore we first materialize the address of the constant pool with a movabsq and then load from there the floating-point value. Fixes <rdar://problem/17674628>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215595 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-13 22:25:35 +00:00
Juergen Ributzka	0701e5d43b	[FastISel][X86] Use XOR to materialize the "0" value. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215594 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-13 22:22:17 +00:00
Juergen Ributzka	f245d9aa77	[FastISel][X86] Emit more efficient instructions for integer constant materialization. This mostly affects the i64 value type, which always resulted in an 15byte mobavsq instruction to materialize any constant. The custom code checks the value of the immediate and tries to use a different and smaller mov instruction when possible. This fixes <rdar://problem/17420988>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215593 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-13 22:18:11 +00:00
Juergen Ributzka	dc408e8069	[FastISel][AArch64] Make use of the zero register when possible. This change materializes now the value "0" from the zero register. The zero register can be folded by several instruction, so no materialization is need at all. Fixes <rdar://problem/17924413>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215591 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-13 22:13:14 +00:00
Juergen Ributzka	eb1c51f8b3	[FastISel] Let the target decide first if it wants to materialize a constant. This changes the order in which FastISel tries to materialize a constant. Originally it would try to use a simple target-independent approach, which can lead to the generation of inefficient code. On X86 this would result in the use of movabsq to materialize any 64bit integer constant - even for simple and small values such as 0 and 1. Also some very funny floating-point materialization could be observed too. On AArch64 it would materialize the constant 0 in a register even the architecture has an actual "zero" register. On ARM it would generate unnecessary mov instructions or not use mvn. This change simply changes the order and always asks the target first if it likes to materialize the constant. This doesn't fix all the issues mentioned above, but it enables the targets to implement such optimizations. Related to <rdar://problem/17420988>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215588 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-13 22:08:02 +00:00
Juergen Ributzka	047423787c	[FastISel][ARM] Use MOVT/MOVW if the subtarget requests it. This change is also in preparation for a future change to make sure that the constant materialization uses MOVT/MOVW when available and not a load from the constant pool. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215584 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-13 21:42:19 +00:00
Benjamin Kramer	4551342b53	Fix (re-)creation of unittest lit.site.cfg for clang-tools-extra. This has been hiding really well. Hopefully brings the builders suffering from outdated lit.site.cfg files back to life. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215575 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-13 20:41:26 +00:00
Jan Vesely	d3fa093dc9	utils: Fix segfault in flattencfg v2: continue iterating through the rest of the bb use for loop v3: initialize FlattenCFG pass in ScalarOps add test v4: split off initializing flattencfg to a separate patch add comment Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215574 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-13 20:31:53 +00:00
Matt Arsenault	bd949eea85	R600: Correctly set the src value offset for scalarized kernel args This for some reason fixes v1i64 kernel arguments on pre-SI. This currently breaks some other cases in the kernel-args.ll test for R600, but I'm not particularly confident in the new output. VTX_READ_* are not used for some of the scalarized cases, and the code reading from the constant buffer doesn't make much sense to me. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215564 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-13 18:14:11 +00:00
Andrea Di Biagio	05a76eb9f2	[DAGCombiner] Improved target independent vector shuffle combine rule. This patch improves the existing algorithm in DAGCombiner that attempts to fold shuffles according to rule: shuffle(shuffle(x, y, M1), undef, M2) -> shuffle(y, undef, M3) Before this change, there were cases where the DAGCombiner conservatively avoided folding shuffles even if the resulting mask would have been legal. That is because the algorithm wrongly assumed that commuting an illegal shuffle mask would always produce an illegal mask. With this change, we now correctly compute the commuted shuffle mask before calling method 'isShuffleMaskLegal' on it. On X86, this improves for example the codegen for the following function: define <4 x i32> @test(<4 x i32> %A, <4 x i32> %B) { %1 = shufflevector <4 x i32> %B, <4 x i32> %A, <4 x i32> <i32 1, i32 2, i32 6, i32 7> %2 = shufflevector <4 x i32> %1, <4 x i32> undef, <4 x i32> <i32 2, i32 3, i32 2, i32 3> ret <4 x i32> %2 } Before this change the X86 backend (-mcpu=corei7) generated the following assembly code for function @test: shufps $-23, %xmm0, %xmm1 # xmm1 = xmm1[1,2],xmm0[2,3] movhlps %xmm1, %xmm1 # xmm1 = xmm1[1,1] movaps %xmm1, %xmm0 Now we produce: movhlps %xmm0, %xmm0 # xmm0 = xmm0[1,1] Added extra test cases in combine-vec-shuffle-2.ll to verify that we correctly fold according to the above-mentioned rule. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215555 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-13 16:09:40 +00:00
Chandler Carruth	701073e58e	[optnone] Make the optnone attribute effective at suppressing function attribute and function argument attribute synthesizing and propagating. As with the other uses of this attribute, the goal remains a best-effort (no guarantees) attempt to not optimize the function or assume things about the function when optimizing. This is particularly useful for compiler testing, bisecting miscompiles, triaging things, etc. I was hitting specific issues using optnone to isolate test code from a test driver for my fuzz testing, and this is one step of fixing that. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215538 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-13 10:49:33 +00:00
Robert Khasanov	232202439a	[SKX] Extended non-temporal load/store instructions for AVX512VL subsets. Added avx512_movnt_vl multiclass for handling 256/128-bit forms of instruction. Added encoding and lowering tests. Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215536 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-13 10:46:00 +00:00
Daniel Sanders	5d16d6c3f0	Re-commit: [mips] Implement .ent, .end, .frame, .mask and .fmask. Patch by Matheus Almeida and Toma Tabacu The lld test failure on the previous attempt to commit was caused by the addition of the .pdr section causing the offsets it was checking to change. This has been fixed by removing the .ent/.end directives from that test since they weren't really needed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215535 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-13 10:07:34 +00:00
Chandler Carruth	5e5aa9438d	Revert r215415 which causse MSan to crash on a great deal of C++ code. I've followed up on the original commit as well. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215532 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-13 09:19:39 +00:00
Elena Demikhovsky	4c97c1420b	AVX-512: Fixed a bug in shufflevector lowering. PALIGNR instruction does not exist in AVX-512F set. Added a test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215526 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-13 07:58:43 +00:00
Karthik Bhat	7ef167ae1f	InstCombine: Combine (xor (or %a, %b) (xor %a, %b)) to (add %a, %b) Correctness proof of the transform using CVC3- $ cat t.cvc A, B : BITVECTOR(32); QUERY BVXOR(A \| B, BVXOR(A,B) ) = A & B; $ cvc3 t.cvc Valid. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215524 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-13 05:13:14 +00:00
Chandler Carruth	6bb093bbe7	[x86] Rewrite a core part of the new vector shuffle lowering to handle one pesky test case correctly. This test case caused the old code to infloop occilating between solving the low-half and the high-half. The 'side balancing' part of single-input v8 shuffle lowering didn't handle the one pattern which can cause it to occilate. Fortunately the fuzz testing found this case. Unfortuately it was terrible to handle. I'm really sorry for the amount and density of the code here, I'd love suggestions on how to simplify it. I feel like there must be a simpler form here, but after a lot of days I've not found it. This is the only one I've found that even works. I've added the one pesky test case along with some nice comments explaining the core problem that we have to solve here. So far this has survived approximately 32k test cases. More strenuous fuzzing commencing. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215519 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-13 01:25:45 +00:00
Hal Finkel	e693d3c558	[PowerPC] Implement PPCTargetLowering::getTgtMemIntrinsic This implements PPCTargetLowering::getTgtMemIntrinsic for Altivec load/store intrinsics. As with the construction of the MachineMemOperands for the intrinsic calls used for unaligned load/store lowering, the only slight complication is that we need to represent a larger memory range than the loaded/stored value-type size (because the address is rounded down to an aligned address, and we need to conservatively represent the entire possible range of the actual access). This required adding an extra size field to TargetLowering::IntrinsicInfo, and this was done in a way that required no modifications to other targets (the size defaults to the store size of the provided memory data type). This fixes test/CodeGen/PowerPC/unal-altivec-wint.ll (so it can be un-XFAILed). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215512 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-13 01:15:40 +00:00
Hal Finkel	695e914c03	Fix classof for ISD::INTRINSIC_W_CHAIN and INTRINSIC_VOID Unfortunately, our use of the SDNode class hierarchy for INTRINSIC_W_CHAIN and INTRINSIC_VOID nodes is somewhat broken right now. These nodes sometimes are used for memory intrinsics (those with MachineMemOperands), and sometimes not. When not, the nodes are not created as instances of MemIntrinsicSDNode, but rather created as some other subclass of SDNode using DAG::getNode. When they are memory intrinsics, they are created using DAG::getMemIntrinsicNode as instances of MemIntrinsicSDNode. MemIntrinsicSDNode is a subclass of MemSDNode, but prior to r214452, we had a non-self-consistent setup whereby MemIntrinsicSDNode::classof on INTRINSIC_W_CHAIN and INTRINSIC_VOID would return true but MemSDNode::classof on INTRINSIC_W_CHAIN and INTRINSIC_VOID would return false. In r214452, MemSDNode::classof was changed to return true for INTRINSIC_W_CHAIN and INTRINSIC_VOID, which is now self-consistent. The problem is that neither the pre-r214452 logic and the post-r214452 logic are really right. The truth is that not all INTRINSIC_W_CHAIN and INTRINSIC_VOID nodes are instances of MemIntrinsicSDNode (or MemSDNode for that matter), and the return value from classof needs to reflect that. This was broken before r214452 (because MemIntrinsicSDNode::classof always returned true), and was broken afterward (because MemSDNode::classof also always returned true), and will now be correct. The minimal solution is to grab one of the SubclassData bits (there is one left for MemIntrinsicSDNode nodes) and use it to store whether or not a particular INTRINSIC_W_CHAIN or INTRINSIC_VOID is really an instance of MemIntrinsicSDNode or not. Doing this allows both MemIntrinsicSDNode::classof and MemSDNode::classof to return the correct answer for the underlying object for both the memory-intrinsic and non-memory-intrinsic cases. This fixes the problem that r214452 created in the SelectionDAGDumper (thanks to Matt Arsenault for pointing it out). Because PowerPC does not implement getTgtMemIntrinsic, this change breaks test/CodeGen/PowerPC/unal-altivec-wint.ll. I've XFAILed it for now, and will fix it in a follow-up commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215511 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-13 01:15:37 +00:00
Adam Nemet	4c9467ea5d	[AVX512] Verify the code generated for the intrinsic _mm512_broadcastsd_pd git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215487 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-13 00:30:05 +00:00
Adam Nemet	6ea7f36872	[AVX512] Handle valign masking intrinsic via C++ lowering I think that this will scale better in most cases than adding a Pat<> for each mapping from the intrinsic DAG to the intruction (i.e. rri, rrik, rrikz). We can just lower to the SDNode and have the resulting DAG be matches by the DAG patterns. Alternatively (long term), we could keep the Pat<>s but generate them via the new AVX512_masking multiclass. The difficulty is that in order to formulate that we would have to concatenate DAGs. Currently this is only supported if the operators of the input DAGs are identical. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215473 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-12 21:13:12 +00:00
Matt Arsenault	00139e51c9	Allwo bitcast + struct GEP transform to work with addrspacecast git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215467 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-12 19:46:13 +00:00
Jan Vesely	3c57820bbb	R600: Use optimized 24bit path in udivrem v2: drop enum keyword use correct extension mode don't bother computing the sign in unsinged case Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215462 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-12 17:31:20 +00:00
Jan Vesely	b40562c0ec	R600: Use i24 optimized path for SREM v2: add tests rename LowerSDIV24 to LowerSDIVREM24 handle the rem part in this function Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215460 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-12 17:31:17 +00:00
Duncan P. N. Exon Smith	49135bb76c	Don't upgrade global constructors when reading bitcode An optional third field was added to `llvm.global_ctors` (and `llvm.global_dtors`) in r209015. Most of the code has been changed to deal with both versions of the variables. Users of the C API might create either version, the helper functions in LLVM create the two-field version, and clang now creates the three-field version. However, the BitcodeReader was changed to always upgrade to the three-field version. This created an unnecessary inconsistency in the IR before/after serializing to bitcode. This commit resolves the inconsistency by making the third field truly optional (and not upgrading in the bitcode reader). Since `llvm-link` was relying on this upgrade code, rather than deleting it I've moved it into `ModuleLinker`, where it upgrades these arrays as necessary to resolve inconsistencies between modules. The ideal resolution would be to remove the 2-field version and make the third field required. I filed PR20506 to track that. I changed `test/Bitcode/upgrade-global-ctors.ll` to a negative test and duplicated the `llvm-link` check in `test/Linker/global_ctors.ll` to check both upgrade directions. Since I came across this as part of PR5680 (serializing use-list order), I've also added the missing `verify-uselistorder` RUN line to `test/Bitcode/metadata-2.ll`. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215457 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-12 16:46:37 +00:00
Rafael Espindola	8b14a499f9	Make the test a bit more strict. Before it would pass even if @b or @c ended up pointing to a variable named @a123. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215450 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-12 15:55:27 +00:00
Rafael Espindola	f531e92671	Add a plugin testcase for merging weak variables. I initially thought I could implement COMDATs with aliases by just internalizing GVs instead of dropping them. This is a counter example: Internalizing one of the @a would make @b and @c point to different variables. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215447 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-12 15:39:14 +00:00
NAKAMURA Takumi	87d7e906ce	llvm/test/TableGen/Foreach.td: Remove XFAIL:vg_leak. They have not been failing since r215176. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215445 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-12 14:06:21 +00:00
Tim Northover	48bfab80aa	llvm-objdump: print contents of MachO __unwind_info sections git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215437 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-12 11:52:59 +00:00
Gerolf Hoflehner	392f7d970c	[MachineCombiner] Fix for ICE bug 20598 The combiner ignored DBG nodes when checking the uses of a virtual register. It combined a sequence like %vreg1 = madd %vreg2, %vreg3,... DBG_VALUE (%vreg1 ...) %vreg4 = add %vreg1,... to %vreg4 = madd %vreg2, %vreg3 leaving behind a dangling DBG_VALUE with a definition. This triggered an assertion in the MachineTraceMetrics.cpp module. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215431 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-12 07:54:12 +00:00
Adrian Prantl	a6bd9bd30e	DebugLocEntry: Restore the comparison predicate from before the refactoring in 215384. This way it can unique multiple entries describing the same piece even if they don't have the exact same location. (The same piece may get merged in and be added from OpenRanges). There ought to be a more elegant solution for this, though. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215418 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-12 01:07:53 +00:00
Reid Kleckner	23761603fe	msan: Handle musttail calls First, avoid calling setTailCall(false) on musttail calls. The funciton prototypes should be "congruent", so the shadow layout should be exactly the same. Second, avoid inserting instrumentation after a musttail call to propagate the return value shadow. We don't need to propagate the result of a tail call, it should already be in the right place. Reviewed By: eugenis Differential Revision: http://reviews.llvm.org/D4331 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215415 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-12 00:12:43 +00:00
Michael J. Spencer	4935833df5	[x86] Fold extract_vector_elt of a load into the Load's address computation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215409 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-11 23:49:33 +00:00
David Majnemer	e8be18e8a3	InstCombine: Combine (add (and %a, %b) (or %a, %b)) to (add %a, %b) What follows bellow is a correctness proof of the transform using CVC3. $ < t.cvc A, B : BITVECTOR(32); QUERY BVPLUS(32, A & B, A \| B) = BVPLUS(32, A, B); $ cvc3 < t.cvc Valid. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215400 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-11 22:32:02 +00:00
Tom Stellard	13f4476c55	R600/SI: Add a ComplexPattern for selecting MUBUF _OFFSET variant This saves us from having to copy a 64-bit 0 value into VGPRs for BUFFER_* instruction which only have a 12-bit immediate offset. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215399 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-11 22:18:17 +00:00
Tom Stellard	68e9ebbe44	R600/SI: Add check for low 32 bits of encoding to mubuf tests There are no variable values like registers encoded in the low 32 bits of MUBUF instructions, so it is relatively easy to check these bits, and it will help prevent us from introducing encoding bugs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215397 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-11 22:18:11 +00:00
Tom Stellard	728d0e4218	R600/SI: Clear lds bit on MUBUF instructions used for private stores This bit was left uninitialized, which was causing some random failures of piglit tests. NOTE: This is a candidate for the 3.5 branch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215396 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-11 22:18:09 +00:00
Tom Stellard	0df264a0fd	R600/SI: Fix broken test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215395 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-11 22:18:05 +00:00
Quentin Colombet	7f4f923aa5	[AArch64] Fix registerAllocator assigns same register for base and wback in pre/post-index load and store. Patch by Steven Wu <stevenwu@apple.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215390 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-11 21:39:53 +00:00
Saleem Abdulrasool	6c2be4ff95	ARM: try harder to detect non-IT eligible instructions For many Thumb-1 register register instructions, setting the CPSR is not permitted inside an IT block. We would not correctly flag those instructions. The previous change to identify this scenario was insufficient as it did not actually catch all the instances. The current list is formed by manual inspection of the ARMv6M ARM. The change to the Thumb2 IT block test is due to the fact that the new more stringent checking of the MIs results in the If Conversion pass being prevented from executing (since not all the instructions in the BB are predicable). This results in code gen changes. Thanks to Tim Northover for pointing out that the previous patch was insufficient and hinting that the use of the v6M ARM would be much easier to use than the v7 or v8! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215382 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-11 20:13:25 +00:00
Rafael Espindola	b2d2f28351	Fix using -plugin-opt=apiflie when also using -plugin-opt=emit-llvm. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215378 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-11 19:06:54 +00:00
Sanjay Patel	7c0fa0cfab	Correct a missing RUN line in the ARM codegen test for fneg ops. We should also explicitly specify +/-neonfp. The bug was introduced at r99570 when use of "-arm-use-neon-fp" was removed. Differential Revision: http://reviews.llvm.org/D4846 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215377 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-11 19:04:28 +00:00
Reid Kleckner	d7f37d823b	Add missing test for r215031 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215374 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-11 18:34:54 +00:00
Reid Kleckner	b81e6cf6c5	MC: Diagnose an unexpected token in COFF .section instead of asserting This can easily arise when trying to assemble and ELF style .section directive for a COFF object file. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215373 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-11 18:34:43 +00:00
Rafael Espindola	69ca26cc97	Fix use of uninitialized variable. Fixes linking bitcode files that use the new style comdats for constructors with ones that don't. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215364 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-11 17:07:34 +00:00
Daniel Sanders	e20f611baf	Revert r215359 - [mips] Implement .ent, .end, .frame, .mask and .fmask assembler directives It seems to cause an lld test (elf/Mips/hilo16-3.test) to fail. Reverted while we investigate. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215361 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-11 16:10:19 +00:00
Daniel Sanders	36cf28e70e	[mips] Implement .ent, .end, .frame, .mask and .fmask assembler directives Patch by Matheus Almeida and Toma Tabacu Differential Revision: http://reviews.llvm.org/D4179 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215359 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-11 15:28:56 +00:00
Tim Northover	af713d3ba7	AArch64: add support for dynamic-loader relocations LLD needs them, and it's good to be able to print them properly when our object dumpers encounter them. Patch by Daniel Stewart. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215352 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-11 10:10:27 +00:00
Tim Northover	f34a60c920	llvm-readobj: zero out timestamp in COFF auto-generated test files. The timestamp meant these files changed with each invocation of relocs.py, confusing matters when we add relocations and need to update the tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215350 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-11 09:53:07 +00:00
Oliver Stannard	17ef00ea94	ARM: __gnu_h2f_ieee and __gnu_f2h_ieee always use the soft-float calling convention By default, LLVM uses the "C" calling convention for all runtime library functions. The half-precision FP conversion functions use the soft-float calling convention, and are needed for some targets which use the hard-float convention by default, so must have their calling convention explicitly set. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215348 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-11 09:12:32 +00:00
Jiangning Liu	0679d2d0a4	In Machine CSE pass, the source register of a COPY machine instruction can be propagated to all its users, and this propagation could increase the probability of finding common subexpressions. If the COPY has only one user, the COPY itself can be removed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215344 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-11 05:17:19 +00:00
Jiangning Liu	1505fa4376	In LVI(Lazy Value Info), originally value on a BB can only be caculated once, and the lattice will be updated to be a state other than "undefined". This limiation could miss some opportunities of lowering "overdefined" to be an even accurate value. So this patch ask the algorithm to try to lower the lattice value again even if the value has been lowered to be "overdefined". git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215343 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-11 05:02:04 +00:00
Petar Jovanovic	97b0c63f6b	Add support for scalarizing cttz_zero_undef Follow up to r214266. Add missing case in ScalarizeVectorResult() for cttz_zero_undef. Differential Revision: http://reviews.llvm.org/D4813 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215330 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-10 22:49:54 +00:00
Saleem Abdulrasool	3e5734dc38	ARM: correct isPredicable for MULS in ThHUMB mode The ARM ARM states that CPSR may not be updated by a MUL in thumb mode. Due to an ordering of Thumb 2 Size Reduction and If Conversion, we would end up generating a THUMB MULS inside an IT block. The If Conversion pass uses the TTI isPredicable method to ensure that it can transform a Basic Block. However, because we only check for IT handling on Thumb2 functions, we may miss some cases. Even then, it only validates that the CPSR is not live rather than it is not accessed. This corrects the handling for that particular case since the same restriction does not hold on the vast majority of the instructions. This does prevent the IfConversion optimization from kicking in in certain cases, but generating correct code is more valuable. Addresses PR20555. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215328 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-10 22:20:37 +00:00
Joerg Sonnenberger	4417c25b03	@l and friends adjust their value depending the context used in. For ori, they are unsigned, for addi, signed. Create a new target expression type to handle this and evaluate Fixups accordingly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215315 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-10 12:41:50 +00:00
Joerg Sonnenberger	06e8dbef25	Allow the third argument for the subi family to be an expression. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215286 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-09 17:10:26 +00:00
Joerg Sonnenberger	0c5a408cf6	Update disassembler test to check the full dccci/iccci form. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215283 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-09 14:01:10 +00:00
Joerg Sonnenberger	26adfd1ca4	Use the full form of dccci and iccci from the early PPC 405 documents, since the operands are actually used on those cores. Provide aliases for the only documented case in the newer Power ISA speec. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215282 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-09 13:58:31 +00:00
Tom Stellard	4e8a136db8	R600/SI: Custom lower CONCAT_VECTORS This will lower them using register copies rather than loads and stores to the stack. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215270 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-09 01:06:56 +00:00
Tom Stellard	f1ba587963	R600/SI: Update concat_vectors.ll to check for scratch usage These tests were using SI-NOT: MOVREL to make sure concat vectors weren't being lowered to stack loads and stores, but we are using scratch buffers for the stack now instead of registers, so we need to add an additional SI-NOT check for scratch buffers. With this change I was able to uncover one broken test which will be fixed in a future commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215269 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-09 01:06:53 +00:00
Joerg Sonnenberger	996a304351	Allow large immediates for branch instructions in 32bit mode. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215240 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-08 20:57:58 +00:00
Joerg Sonnenberger	f0b70e2fbc	Provide an implementation of getNoopForMachoTarget for PPC, otherwise empty functions will assert in the MC object writer. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215238 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-08 19:13:23 +00:00
Juergen Ributzka	0980ef248e	[FastISel][X86] Fix INC/DEC optimization (r215230) I accidentally also used INC/DEC for unsigned arithmetic which doesn't work, because INC/DEC don't set the required flag which is used for the overflow check. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215237 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-08 18:47:04 +00:00
Juergen Ributzka	cbda4b32c6	[FastISel][X86] Use INC/DEC when possible for {sadd\|ssub}.with.overflow intrinsics. This is a small peephole optimization to emit INC/DEC when possible. Fixes <rdar://problem/17952308>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215230 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-08 17:21:37 +00:00
Joerg Sonnenberger	c9def6b938	Add support for SPE load/store from memory. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215220 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-08 16:43:49 +00:00
Rafael Espindola	d272237f0b	pr20589: Fix duplicated arch flag. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215216 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-08 16:18:29 +00:00
Daniel Sanders	a19fc6deb8	[mips] Invert the abicalls feature bit to be noabicalls so that it's possible for -mno-abicalls to take effect. Also added the testcase that should have been in r215194. This behaviour has surprised me a few times now. The problem is that the generated MipsSubtarget::ParseSubtargetFeatures() contains code like this: if ((Bits & Mips::FeatureABICalls) != 0) IsABICalls = true; so '-abicalls' means 'leave it at the default' and '+abicalls' means 'set it to true'. In this case, (and the similar -modd-spreg case) I'd like the code to be IsABICalls = (Bits & Mips::FeatureABICalls) != 0; or possibly: if ((Bits & Mips::FeatureABICalls) != 0) IsABICalls = true; else IsABICalls = false; and preferably arrange for 'Bits & Mips::FeatureABICalls' to be true by default (on some triples). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215211 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-08 15:47:17 +00:00
Josh Klontz	b79931d94e	Add missing Interpreter intrinsic lowering for sin, cos and ceil git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215209 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-08 15:00:12 +00:00
Jiangning Liu	3b85f30319	[AArch64] Fix a type conversion bug for anlyzing compare. The bug can cause spec2006/483.xalancbmk failure. Patched by David Xu. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215206 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-08 14:19:29 +00:00
Daniel Sanders	1952807c9c	[mips] Remove reason for XFAIL from a test that isn't actually XFAILed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215201 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-08 12:58:17 +00:00
James Molloy	414df79b80	[LoopVectorizer] Enable support for floating-point subtraction reductions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215200 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-08 12:41:08 +00:00
James Molloy	3a106a2813	[AArch64] Add an FP load balancing pass for Cortex-A57 For best-case performance on Cortex-A57, we should try to use a balanced mix of odd and even D-registers when performing a critical sequence of independent, non-quadword FP/ASIMD floating-point multiply or multiply-accumulate operations. This pass attempts to detect situations where the register allocation may adversely affect this load balancing and to change the registers used so as to better utilize the CPU. Ideally we'd just take each multiply or multiply-accumulate in turn and allocate it alternating even or odd registers. However, multiply-accumulates are most efficiently performed in the same functional unit as their accumulation operand. Therefore this pass tries to find maximal sequences ("Chains") of multiply-accumulates linked via their accumulation operand, and assign them all the same "color" (oddness/evenness). This optimization affects S-register and D-register floating point multiplies and FMADD/FMAs, as well as vector (floating point only) muls and FMADD/FMA. Q register instructions (and 128-bit vector instructions) are not affected. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215199 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-08 12:33:21 +00:00
Tim Northover	bbdf1e0432	AArch64: stop trying to take control of all UnknownArch triples. This short-circuited our error reporting for incorrectly specified target triples (you'd get AArch64 code instead). Should fix PR20567. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215191 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-08 08:27:44 +00:00
Patrik Hagglund	cf403861a3	[pr19635] Revert most of r170537, and add new testcase. Patch provided by Andrey Kuharev. Sorry, r170537 was obviously wrong. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215190 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-08 08:21:19 +00:00
David Majnemer	8e5c298a17	GlobalOpt: Optimize in the face of insertvalue/extractvalue GlobalOpt didn't know how to simulate InsertValueInst or ExtractValueInst. Optimizing these is pretty straightforward. N.B. This came up when looking at clang's IRGen for MS ABI member pointers; they are represented as aggregates. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215184 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-08 05:50:43 +00:00

1 2 3 4 5 ...

25818 Commits