llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-08-22 10:29:35 +00:00

Author	SHA1	Message	Date
Tim Northover	8cd39a2630	Re-reapply r221924: "[GVN] Perform Scalar PRE on gep indices that feed loads before doing Load PRE" It's not really expected to stick around, last time it provoked a weird LTO build failure that I can't reproduce now, and the bot logs are long gone. I'll re-revert it if the failures recur. Original description: Perform Scalar PRE on gep indices that feed loads before doing Load PRE. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225536 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 19:19:56 +00:00
Daniel Sanders	8d7b0bdcf0	[mips] Add support for accessing $gp as a named register. Summary: Mips Linux uses $gp to hold a pointer to thread info structure and accesses it with a named register. This makes this work for LLVM. The N32 ABI doesn't quite work yet since the frontend generates incorrect IR for this case. It neglects to truncate the 64-bit GPR to a 32-bit value before converting to a pointer. Given correct IR (as in the testcase in this patch), it works correctly. Reviewers: sstankovic, vmedic, atanasyan Reviewed By: atanasyan Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6893 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225529 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 17:21:30 +00:00
Hal Finkel	139bfee84c	[PowerPC] Enable late partial unrolling on the POWER7 The P7 benefits from not having really-small loops so that we either have multiple dispatch groups in the loop and/or the ability to form more-full dispatch groups during scheduling. Setting the partial unrolling threshold to 44 seems good, empirically, for the P7. Compared to using no late partial unrolling, this yields the following test-suite speedups: SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding -66.3253% +/- 24.1975% SingleSource/Benchmarks/Misc-C++/oopack_v1p8 -44.0169% +/- 29.4881% SingleSource/Benchmarks/Misc/pi -27.8351% +/- 12.2712% SingleSource/Benchmarks/Stanford/Bubblesort -30.9898% +/- 22.4647% I've speculatively added a similar setting for the P8. Also, I've noticed that the unroller does not quite calculate the unrolling factor correctly for really tiny loops because it neglects to account for the fact that not every loop body replicant contains an ending branch and counter increment. I'll fix that later. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225522 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 15:51:16 +00:00
Saleem Abdulrasool	c2a1df7125	ARM: add support for R_ARM_ABS16 Add support for R_ARM_ABS16 relocation mapping. Addresses PR22156. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225510 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 06:57:24 +00:00
Saleem Abdulrasool	466a7dea9b	test: add additional test for SVN r225507 Add an additional test case to ensure that we generate the relocation even if the thumb target is used. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225509 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 06:57:18 +00:00
Saleem Abdulrasool	ea4fe48b22	ARM: add support for R_ARM_ABS8 relocations Add support for R_ARM_ABS8 relocation. Addresses PR22126. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225507 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 05:59:12 +00:00
Matthias Braun	c41acffe22	RegisterCoalescer: Fix removeCopyByCommutingDef with subreg liveness The code that eliminated additional coalescable copies in removeCopyByCommutingDef() used MergeValueNumberInto() which internally may merge A into B or B into A. In this case A and B had different Def points, so we have to reset ValNo.Def to the intended one after merging. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225503 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 03:01:31 +00:00
Hal Finkel	4e98296890	[PowerPC] Fold [sz]ext with fp_to_int lowering where possible On modern cores with lfiw[az]x, we can fold a sign or zero extension from i32 to i64 into the load necessary for an i64 -> fp conversion. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225493 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 01:34:30 +00:00
Duncan P. N. Exon Smith	3408708548	Utils: Keep distinct MDNodes distinct in MapMetadata() Create new copies of distinct `MDNode`s instead of following the uniquing `MDNode` logic. Just like self-references (or other cycles), `MapMetadata()` creates a new node. In practice most calls use `RF_NoModuleLevelChanges`, in which case nothing is duplicated anyway. Part of PR22111. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225476 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 22:42:30 +00:00
Duncan P. N. Exon Smith	f416d72973	IR: Add 'distinct' MDNodes to bitcode and assembly Propagate whether `MDNode`s are 'distinct' through the other types of IR (assembly and bitcode). This adds the `distinct` keyword to assembly. Currently, no one actually calls `MDNode::getDistinct()`, so these nodes only get created for: - self-references, which are never uniqued, and - nodes whose operands are replaced that hit a uniquing collision. The concept of distinct nodes is still not quite first-class, since distinct-ness doesn't yet survive across `MapMetadata()`. Part of PR22111. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225474 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 22:38:29 +00:00
Hal Finkel	b7c01bf403	[PowerPC] Mark all instructions as non-cheap for MachineLICM MachineLICM uses a callback named hasLowDefLatency to determine if an instruction def operand has a 'low' latency. If all relevant operands have a 'low' latency, the instruction is considered too cheap to hoist out of loops even in low-register-pressure situations. On PowerPC cores, both the embedded cores and the others, there is no reason to believe that this is a good choice: all instructions have a cost inside a loop, and hoisting them when not limited by register pressure is a reasonable default. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225471 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 22:11:49 +00:00
Akira Hatanaka	40cd57eb5c	[ARM] Fix a bug in constant island pass that was triggering an assertion. The assert was being triggered when the distance between a constant pool entry and its user exceeded the maximally allowed distance after thumb2 branch shortening. A padding was inserted after a thumb2 branch instruction was shrunk, which caused the user to be out of range. This is wrong as the padding should have been inserted by the layout algorithm so that the distance between two instructions doesn't grow later during thumb2 instruction optimization. This commit fixes the code in ARMConstantIslands::createNewWater to call computeBlockSize and set BasicBlock::Unalign when a branch instruction is inserted to create new water after a basic block. A non-zero Unalign causes the worst-case padding to be inserted when adjustBBOffsetsAfter is called to recompute the basic block offsets. rdar://problem/19130476 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225467 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 20:44:50 +00:00
Matt Arsenault	3b1f741856	Fix fcmp + fabs instcombines when using the intrinsic This was only handling the libcall. This is another example of why only the intrinsic should ever be used when it exists. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225465 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 20:09:34 +00:00
Lang Hames	1b3d915de6	[MCJIT] Remove a few redundant MCJIT tests, and drop the extraneous datalayout strings from the copies that remain. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225460 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 18:52:15 +00:00
Rafael Espindola	8aab70ebfe	Make this test a bit stricter. It now checks for the end of the line or the opening '{'. While at it, remove empty comments. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225451 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 16:11:18 +00:00
Justin Hibbits	77e85a150c	Add saving and restoring of r30 to the prologue and epilogue, respectively Summary: The PIC additions didn't update the prologue and epilogue code to save and restore r30 (PIC base register). This does that. Test Plan: Tests updated. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6876 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225450 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 15:47:19 +00:00
Kristof Beyls	d1cee9b3bc	Fix large stack alignment codegen for ARM and Thumb2 targets This partially fixes PR13007 (ARM CodeGen fails with large stack alignment): for ARM and Thumb2 targets, but not for Thumb1, as it seems stack alignment for Thumb1 targets hasn't been supported at all. Producing an aligned stack pointer is done by zeroing out the lower bits of the stack pointer. The BIC instruction was used for this. However, the immediate field of the BIC instruction can only encode an immediate that zeroes out up to a maximum of the 8 lower bits. When a larger alignment is requested, a BIC instruction cannot be used; llvm was silently producing incorrect code in this case. This commit fixes code generation for large stack alignments by using the BFC instruction instead, when the BFC instruction is available. When not, it uses 2 instructions: a right shift, followed by a left shift to zero out the lower bits. The lowering of ARM::Int_eh_sjlj_dispatchsetup still has code that unconditionally uses BIC to realign the stack pointer, so it very likely has the same problem. However, I wasn't able to produce a test case for that. This commit adds an assert so that the compiler will fail the assert instead of silently generating wrong code if this is ever reached. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225446 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 15:09:14 +00:00
Tom Stellard	9a6e4f08fe	R600/SI: Remove SIISelLowering::legalizeOperands() Its functionality has been replaced by calling SIInstrInfo::legalizeOperands() from SIISelLowering::AdjustInstrPostInstrSelection() and running the SIFoldOperands and SIShrinkInstructions passes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225445 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 15:08:17 +00:00
Elena Demikhovsky	6e8b53da17	Masked Load/Store - fixed a bug in type legalization. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225441 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 12:29:19 +00:00
Michael Kuperstein	477eba5f81	Fix a think-o in the test for r225438. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225440 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 12:05:02 +00:00
Michael Kuperstein	0858c28ca8	[X86] Don't try to generate direct calls to TLS globals The call lowering assumes that if the callee is a global, we want to emit a direct call. This is correct for regular globals, but not for TLS ones. Differential Revision: http://reviews.llvm.org/D6862 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225438 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 11:50:58 +00:00
Craig Topper	cb964a5c58	Fix test case I missed in r225432. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225434 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 07:57:27 +00:00
Craig Topper	367b67df3e	[X86] Don't print 'dword ptr' or 'qword ptr' on the operand to some of the LEA variants in Intel syntax. The memory operand is inherently unsized. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225432 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 07:41:30 +00:00
Adrian Prantl	7e44a65e6b	Revert "Reapply: Teach SROA how to update debug info for fragmented variables." This reverts commit r225379 while investigating an assertion failure reported by Alexey. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225424 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 02:02:00 +00:00
Quentin Colombet	9d60e0ff0a	[RegAllocGreedy] Introduce a late pass to repair broken hints. A broken hint is a copy where both ends are assigned different colors. When a variable gets evicted in the neighborhood of such copies, it is likely we can reconcile some of them. Context Copies are inserted during the register allocation via splitting. These split points are required to relax the constraints on the allocation problem. When such a point is inserted, both ends of the copy would not share the same color with respect to the current allocation problem. When variables get evicted, the allocation problem becomes different and some split point may not be required anymore. However, the related variables may already have been colored. This usually shows up in the assembly with a pattern like this: def A ... save A to B def A use A restore A from B ... use B Whereas we could simply have done: def B ... def A use A ... use B Proposed Solution A variable having a broken hint is marked for late recoloring if and only if selecting a register for it evicts another variable. Indeed, if no eviction happens it is pointless to look for recoloring opportunities as it means the situation was the same as the initial allocation problem where we had to break the hint. Finally, when everything has been allocated, we look for recoloring opportunities for all the identified candidates. The recoloring is performed very late to rely on accurate copy cost (all involved variables are allocated). The recoloring is simple unlike the last chance recoloring. It propagates the color of the broken hint to all its copy-related variables. If the color is available for them, the recoloring uses it, otherwise it gives up on that hint even if a more complex coloring would have worked. The recoloring happens only if it is profitable. The profitability is evaluated using the expected frequency of the copies of the currently recolored variable with a) its current color and b) with the target color. 
If a) is greater than or equal to b), then it is profitable and the recoloring happens. Example Consider the following example: BB1: a = b = BB2: ... = b = a Let us assume b gets split: BB1: a = b = BB2: c = b ... d = c = d = a Because of how the allocation works, b, c, and d may be assigned different colors. Now, if a gets evicted to make room for c, assuming b and d were assigned to something different than a, we end up with: BB1: a = st a, SpillSlot b = BB2: c = b ... d = c = d e = ld SpillSlot = e It is likely that we can assign the same register for b, c, and d, getting rid of 2 copies. Performances Both ARM64 and x86_64 show performance improvements of up to 3% for the llvm-testsuite + externals with Os and O3. There are a few regressions too that come from the (in)accuracy of the block frequency estimate. <rdar://problem/18312047> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225422 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 01:16:39 +00:00
Matthias Braun	a065cf13cd	RegisterCoalescer: Fix valuesIdentical() in some subrange merge cases. I got confused and assumed SrcIdx/DstIdx of the CoalescerPair is a subregister index in SrcReg/DstReg, but they are actually subregister indices of the coalesced register that get you back to SrcReg/DstReg when applied. Fixed the bug, improved comments and simplified code accordingly. Testcase by Tom Stellard! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225415 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 23:58:38 +00:00
Philip Reames	a7f8f932a6	[GC] improve testing around gc.relocate and fix a test Patch by: Ramkumar Ramachandra <artagnon@gmail.com> "This patch started out as an exploration of gc.relocate, and an attempt to write a simple test in call-lowering. I then noticed that the arguments of gc.relocate were not checked fully, so I went in and fixed a few things. Finally, the most important outcome of this patch is that my new error handling code caught a bug in a callsite in stackmap-format." Differential Revision: http://reviews.llvm.org/D6824 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225412 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 22:48:01 +00:00
Tom Stellard	a36b682c17	R600/SI: Commute instructions to enable more folding opportunities git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225410 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 22:44:19 +00:00
Tom Stellard	a3ee583339	R600/SI: Only fold immediates that have one use Folding the same immediate into multiple instructions will increase program size, which can hurt performance. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225405 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 22:18:27 +00:00
Duncan P. N. Exon Smith	c742e3a68d	Linker: Don't use MDNode::replaceOperandWith() `MDNode::replaceOperandWith()` changes all instances of metadata. Stop using it when linking module flags, since (due to uniquing) the flag values could be used by other metadata. Instead, use new API `NamedMDNode::setOperand()` to update the reference directly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225397 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 21:32:27 +00:00
Alexey Samsonov	ec1494b99f	XFAIL several MCJIT EH tests under ASan and MSan bootstrap. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225393 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 21:27:26 +00:00
Rafael Espindola	5061ecc615	Add a test that would have found the issue in r224935. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225385 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 21:10:25 +00:00
Kevin Enderby	60e9ca4c0f	Slightly refactor things for llvm-objdump and the -macho option so it can be used with options other than just -disassemble so that universal files can be used with other options combined with -arch options. No functional change to existing options and use. One test case added for the additional functionality with a universal file and a -arch option. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225383 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 21:02:18 +00:00
Olivier Sallenave	033a537a84	More FMA folding opportunities. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225380 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 20:54:17 +00:00
Adrian Prantl	50bf54ccf4	Reapply: Teach SROA how to update debug info for fragmented variables. The two buildbot failures were addressed in LLVM r225378 and CFE r225359. This reapplies commit 225272 without modifications. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225379 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 20:52:22 +00:00
Adrian Prantl	7950596b55	Debug info: Allow aggregate types to be described by constants. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225378 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 20:48:58 +00:00
Colin LeMahieu	51817073b3	[Hexagon] Adding floating point classification and creation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225374 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 20:28:57 +00:00
Tom Stellard	f7587043ef	R600/SI: Add a V_MOV_B64 pseudo instruction This is used to simplify the SIFoldOperands pass and make it easier to fold immediates. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225373 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 20:27:25 +00:00
Colin LeMahieu	22ddfae848	[Hexagon] Adding encodings for v5 floating point instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225372 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 20:24:09 +00:00
Colin LeMahieu	22efbc70a7	[Hexagon] Adding encoding for popcount, fastcorner, dword asr with rounding. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225371 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 20:07:28 +00:00
Tom Stellard	546520a727	R600/SI: Teach SIFoldOperands to split 64-bit constants when folding This allows folding of sequences like: s[0:1] = s_mov_b64 4 v_add_i32 v0, s0, v0 v_addc_u32 v1, s1, v1 into v_add_i32 v0, 4, v0 v_add_i32 v1, 0, v1 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225369 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 19:56:17 +00:00
Philip Reames	28fa9e1e9f	Introduce an example statepoint GC strategy This change includes the most basic possible GCStrategy for a GC which is using the statepoint lowering code. At the moment, this GCStrategy doesn't really do much - aside from actually generate correct stackmaps that is - but I went ahead and added a few extra correctness checks as proof of concept. It's mostly here to provide documentation on how to do one, and to provide a point for various optimization legality hooks I'd like to add going forward. (For context, see the TODOs in InstCombine around gc.relocate.) Most of the validation logic added here as proof of concept will soon move into the Verifier. That move is dependent on http://reviews.llvm.org/D6811 There was discussion in the review thread about addrspace(1) being reserved for something. I'm going to follow up on a separate llvmdev thread. If needed, I'll update all the code at once. Note that I am deliberately not making a GCStrategy required to use gc.statepoints with this change. I want to give folks out of tree - including myself - a chance to migrate. In a week or two, I'll make having a GCStrategy be required for gc.statepoints. To this end, I added the gc tag to one of the test cases but not others. Differential Revision: http://reviews.llvm.org/D6808 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225365 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 19:07:50 +00:00
David Majnemer	60812c05e7	X86: Allow the stack probe size to be configurable per function LLVM emits stack probes on Windows targets to ensure that the stack is correctly accessed. However, the amount of stack allocated before emitting such a probe is hardcoded to 4096. It is desirable to have this be configurable so that a function might opt-out of stack probes. Our level of granularity is at the function level instead of, say, the module level to permit proper generation of code after LTO. Patch by Andrew H! N.B. The inliner needs to be updated to properly consider what happens after inlining a function with a specific stack-probe-size into another function with a different stack-probe-size. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225360 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 18:14:07 +00:00
Ahmed Bougacha	d412c608fc	[X86] Teach FCOPYSIGN lowering to recognize constant magnitudes. For code like: float foo(float x) { return copysign(1.0, x); } We used to generate: andps <-0.000000e+00,0,0,0>, %xmm0 movss <1.000000e+00>, %xmm1 andps <nan>, %xmm1 orps %xmm0, %xmm1 Basically doing an abs(1.0f) in the two middle instructions. We now generate: andps <-0.000000e+00,0,0,0>, %xmm0 orps <1.000000e+00,0,0,0>, %xmm0 Builds on cleanups r223415, r223542. rdar://19049548 Differential Revision: http://reviews.llvm.org/D6555 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225357 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 17:33:03 +00:00
Charlie Turner	7fbbc81d65	[ARM] Add missing Tag_DIV_use tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225348 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 11:37:40 +00:00
Chandler Carruth	7372d445af	[PM] Give slightly less horrible names to the utility pass templates for requiring and invalidating specific analyses. Also make their printed names match their class names. Writing these out as prose really doesn't make sense to me any more. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225346 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 11:14:51 +00:00
Karthik Bhat	f2b3638c3d	Revert r225165 and r225169 Even though gcc produces similar instructions, as Owen pointed out the two patterns aren’t equivalent in the case where the original subtraction could have caused an overflow. Reverting the same. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225341 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 06:34:34 +00:00
Chandler Carruth	9fc5a53118	[PM] Fix a pretty nasty bug where the new pass manager would invalidate passes too many times. I think this is actually the issue that someone raised with me at the developer's meeting and in an email, but that we never really got to the bottom of. Having all the testing utilities made it much easier to dig down and uncover the core issue. When a pass manager is running many passes over a single function, we need it to invalidate the analyses between each run so that they can be re-computed as needed. We also need to track the intersection of preserved higher-level analyses across all the passes that we run (for example, if there is one module analysis which all the function analyses preserve, we want to track that and propagate it). Unfortunately, this interacted poorly with any enclosing pass adaptor between two IR units. It would see the intersection of preserved analyses, and need to invalidate any other analyses, but some of the un-preserved analyses might have already been invalidated and recomputed! We would fail to propagate the fact that the analysis had already been invalidated. The solution to this struck me as really strange at first, but the more I thought about it, the more natural it seemed. After a nice discussion with Duncan about it on IRC, it seemed even nicer. The idea is that invalidating an analysis causes it to be preserved! Preserving the lack of result is trivial. If it is recomputed, great. Until something else invalidates it again, we're good. The consequence of this is that the invalidate methods on the analysis manager which operate over many passes now consume their PreservedAnalyses object, update it to "preserve" every analysis pass to which it delivers an invalidation (regardless of whether the pass chooses to be removed, or handles the invalidation itself by updating itself). 
Then we return this augmented set from the invalidate routine, letting the pass manager take the result and use the intersection of that across each pass run to compute the final preserved set. This accounts for all the places where the early invalidation of an analysis has already "preserved" it for a future run. I've beefed up the testing and adjusted the assertions to show that we no longer repeatedly invalidate or compute the analyses across nested pass managers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225333 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 01:58:35 +00:00
Matt Arsenault	6a72b20325	R600/SI: Add combine for isinfinite pattern git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225310 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 23:00:46 +00:00
Matt Arsenault	42d9f7cf0a	R600/SI: Pattern match isinf to v_cmp_class instructions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225307 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 23:00:41 +00:00
Matt Arsenault	a5b2b64292	R600/SI: Add basic DAG combines for fp_class git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225306 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 23:00:39 +00:00
Matt Arsenault	b6520ab625	R600/SI: Add class intrinsic git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225305 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 23:00:37 +00:00
Matt Arsenault	374b57cec9	Fix using wrong intrinsic in test This is a leftover from renaming the intrinsic. It's surprising the unknown llvm. intrinsic wasn't rejected. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225304 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 23:00:33 +00:00
Rafael Espindola	f907a26bc2	Change the .ll syntax for comdats and add a syntactic sugar. In order to make comdats always explicit in the IR, we decided to make the syntax a bit more compact for the case of a GlobalObject in a comdat with the same name. Just dropping the $name causes problems for @foo = global i32 0, comdat $bar = comdat ... and declare void @foo() comdat $bar = comdat ... So the syntax is changed to @g1 = global i32 0, comdat($c1) @g2 = global i32 0, comdat and declare void @foo() comdat($c1) declare void @foo() comdat git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225302 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 22:55:16 +00:00
Hal Finkel	8e9ba0e588	[PowerPC] Reuse a load operand in int->fp conversions int->fp conversions on PPC must be done through memory loads and stores. On a modern core, this process begins by storing the int value to memory, then loading it using a (sometimes special) FP load instruction. Unfortunately, we would do this even when the value to be converted was itself a load, and we can just use that same memory location instead of copying it to another first. There is a slight complication when handling int_to_fp(fp_to_int(x)) pairs, because the fp_to_int operand has not been lowered when the int_to_fp is being lowered. We handle this specially by invoking fp_to_int's lowering logic (partially) and getting the necessary memory location (some trivial refactoring was done to make this possible). This is all somewhat ugly, and it would be nice if some later CodeGen stage could just clean this stuff up, but because doing so would involve modifying target-specific nodes (or instructions), it is not immediately clear how that would work. Also, remove a related entry from the README.txt for which we now generate reasonable code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225301 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 22:31:02 +00:00
Colin LeMahieu	a602a7f199	[Hexagon] Adding compound jump encodings. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225291 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 20:03:31 +00:00
Tom Stellard	bac89f3dd2	R600/SI: Insert s_waitcnt before s_barrier instructions. This ensures that all memory operations are complete when all threads reach the barrier. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225290 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 19:52:07 +00:00
Adrian Prantl	d2c42b9617	Revert "Reapply: Teach SROA how to update debug info for fragmented variables." because of a tsan buildbot failure. This reverts commit 225272. Fix should be coming soon. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225288 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 19:47:27 +00:00
Colin LeMahieu	3d1d6d9043	[Hexagon] Adding encoding for misc v4 instructions: boundscheck, tlbmatch, dcfetch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225283 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 19:03:20 +00:00
Sanjoy Das	31123d4529	This patch teaches IndVarSimplify to add nuw and nsw to certain kinds of operations that provably don't overflow. For example, we can prove %civ.inc below does not sign-overflow. With this change, IndVarSimplify changes %civ.inc to an add nsw. define i32 @foo(i32* %array, i32* %length_ptr, i32 %init) { entry: %length = load i32* %length_ptr, !range !0 %len.sub.1 = sub i32 %length, 1 %upper = icmp slt i32 %init, %len.sub.1 br i1 %upper, label %loop, label %exit loop: %civ = phi i32 [ %init, %entry ], [ %civ.inc, %latch ] %civ.inc = add i32 %civ, 1 %cmp = icmp slt i32 %civ.inc, %length br i1 %cmp, label %latch, label %break latch: store i32 0, i32* %array %check = icmp slt i32 %civ.inc, %len.sub.1 br i1 %check, label %loop, label %break break: ret i32 %civ.inc exit: ret i32 42 } Differential Revision: http://reviews.llvm.org/D6748 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225282 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 19:02:56 +00:00
Colin LeMahieu	63d0449f11	[Hexagon] Adding encoding information for absolute address loads. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225279 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 18:38:26 +00:00
Tom Stellard	1f996fa36b	R600/SI: Add a stub GCNTargetMachine This is equivalent to the AMDGPUTargetMachine now, but it is the starting point for separating R600 and GCN functionality into separate targets. It is recommended that users start using the gcn triple for GCN-based GPUs, because using the r600 triple for these GPUs will be deprecated in the future. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225277 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 18:00:21 +00:00
Andrea Di Biagio	e46783d5b7	[CodeGenPrepare] Improved logic to speculate calls to cttz/ctlz. This patch improves the logic added at revision 224899 (see review D6728) that teaches the backend when it is profitable to speculate calls to cttz/ctlz. The original algorithm conservatively avoided speculating more than one instruction from a basic block in a control flow grap modelling an if-statement. In particular, the only allowed instruction (excluding the terminator) was a call to cttz/ctlz. However, there are cases where we could be less conservative and still be able to speculate a call to cttz/ctlz. With this patch, CodeGenPrepare now tries to speculate a cttz/ctlz if the result is zero extended/truncated in the same basic block, and the zext/trunc instruction is "free" for the target. Added new test cases to CodeGen/X86/cttz-ctlz.ll Differential Revision: http://reviews.llvm.org/D6853 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225274 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 17:41:18 +00:00
Adrian Prantl	46cb54c0fb	Reapply: Teach SROA how to update debug info for fragmented variables. This also rolls in the changes discussed in http://reviews.llvm.org/D6766. Defers migrating the debug info for new allocas until after all partitions are created. Thanks to Chandler for reviewing! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225272 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 17:14:10 +00:00
Filipe Cabecinhas	d682839830	Don't loop endlessly for MachO files with 0 ncmds git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225271 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 17:08:26 +00:00
Hal Finkel	15914b5c22	[PowerPC] Add a regression test for r225251 In r225251, I removed an old entry from the README.txt file. While there are several contributing factors (including pieces in Clang's ABI code), upon further reflection, the backend part deserves a regression test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225268 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 16:46:37 +00:00
Colin LeMahieu	a24e012976	[Hexagon] Adding dealloc_return encoding and absolute address stores. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225267 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 16:15:15 +00:00
Matt Arsenault	d883ca0ca7	Convert fcmp with 0.0 from casted integers to icmp This is already handled in general when it is known the conversion can't lose bits with smaller integer types casted into wider floating point types. This pattern happens somewhat often in GPU programs that cast workitem intrinsics to float, which are often compared with 0. Specifically handle the special case of compares with zero which should also be known to not lose information. I had a more general version of this which allows equality compares if the casted float is exactly representable in the integer, but I'm not 100% confident that is always correct. Also fold cases that aren't integers to true / false. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225265 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 15:50:59 +00:00
Chandler Carruth	0df89054c0	[PM] Introduce a utility pass that preserves no analyses. Use this to test that path of invalidation. This test actually shows redundant invalidation here that is really bad. I'm going to work on fixing that next, but wanted to commit the test harness now that it's all working. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225257 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 09:06:35 +00:00
Craig Topper	f6145affbf	[X86] Add OpSize32 to XBEGIN_4. Add XBEGIN_2 with OpSize16. Requires new AsmParserOperand types that detect 16-bit and 32/64-bit mode so that we choose the right instruction based on default sizing without predicates. This is necessary since predicates mess up the disassembler table building. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225256 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 08:59:30 +00:00
David Majnemer	51e4a66417	InstCombine: Bitcast call arguments from/to pointer/integer type Try harder to get rid of bitcast'd calls by ptrtoint/inttoptr'ing arguments and return values when DataLayout says it is safe to do so. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225254 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 08:41:31 +00:00
Chandler Carruth	6409bd68de	[PM] Simplify how we parse the outer layer of the pass pipeline text and remove an extra, redundant pass manager wrapping every run. I had kept seeing these when manually testing, but it was getting really annoying and was going to cause problems with overly eager invalidation. The root cause was an overly complex and unnecessary pile of code for parsing the outer layer of the pass pipeline. We can instead delegate most of this to the recursive pipeline parsing. I've added some somewhat more basic and precise tests to catch this. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225253 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 08:37:58 +00:00
David Majnemer	b3065539bd	X86: Don't make illegal GOTTPOFF relocations "ELF Handling for Thread-Local Storage" specifies that R_X86_64_GOTTPOFF relocation target a movq or addq instruction. Prohibit the truncation of such loads to movl or addl. This fixes PR22083. Differential Revision: http://reviews.llvm.org/D6839 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225250 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 07:12:52 +00:00
Hal Finkel	10ae865847	[PowerPC] Improve int_to_fp(fp_to_int(x)) combining The old target DAG combine that allowed for performing int_to_fp(fp_to_int(x)) without a load/store pair is updated here with support for unsigned integers, and to support single-precision values without a third rounding step, on newer cores with the appropriate instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225248 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 06:01:57 +00:00
Chandler Carruth	17395fa733	[PM] Add a utility pass template that synthesizes the invalidation of a specific analysis result. This is quite handy to test things, and will also likely be very useful for debugging issues. You could narrow down pass validation failures by walking these invalidate pass runs up and down the pass pipeline, etc. I've added support to the pass pipeline parsing to be able to create one of these for any analysis pass desired. Just adding this class uncovered one latent bug where the AnalysisManager CRTP base class had a hard-coded Module type rather than using IRUnitT. I've also added tests for invalidation and caching of analyses in a basic way across all the pass managers. These in turn uncovered two more bugs where we failed to correctly invalidate an analysis -- its results were invalidated but the key for re-running the pass was never cleared and so it was never re-run. Quite nasty. I'm very glad to debug this here rather than with a full system. Also, yes, the naming here is horrid. I'm going to update some of the names to be slightly less awful shortly. But really, I've no "good" ideas for naming. I'll be satisfied if I can get it to "not bad". git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225246 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 04:49:44 +00:00
Chandler Carruth	5b12a2f703	[PM] Add a collection of no-op analysis passes and switch the new pass manager tests to use them and be significantly more comprehensive. This, naturally, uncovered a bug where the CGSCC pass manager wasn't printing analyses when they were run. The only remaining core manipulator is I think an invalidate pass similar to the require pass. That'll be next. =] git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225240 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 02:50:06 +00:00
Chandler Carruth	a3376d2d36	[PM] Add a utility to the new pass manager for generating a pass which is a no-op other than requiring some analysis results be available. This can be used in real pass pipelines to force the usually lazy analysis running to eagerly compute something at a specific point, and it can be used to test the pass manager infrastructure (my primary use at the moment). I've also added a bit of pipeline parsing magic to support generating these directly from the opt command so that you can directly use these when debugging your analysis. The syntax is: require<analysis-name> This can be used at any level of the pass manager. For example: cgscc(function(require<my-analysis>,no-op-function)) This would produce a no-op function pass requiring my-analysis, followed by a fully no-op function pass, both of these in a function pass manager which is nested inside of a bottom-up CGSCC pass manager which is in the top-level (implicit) module pass manager. I have zero attachment to the particular syntax I'm using here. Consider it a straw man for use while I'm testing and fleshing things out. Suggestions for better syntax welcome, and I'll update everything based on any consensus that develops. I've used this new functionality to more directly test the analysis printing rather than relying on the cgscc pass manager running an analysis for me. This is still minimally tested because I need to have analyses to run first! ;] That patch is next, but wanted to keep this one separate for easier review and discussion. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225236 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 02:10:51 +00:00
Rafael Espindola	5165dfdf9a	Add a testcase that would have found the problem in r225048. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225235 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 01:41:24 +00:00
Lang Hames	bce877c84c	Revert r225048: It broke ObjC on AArch64. I've filed http://llvm.org/PR22100 to track this issue. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225228 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 00:54:32 +00:00
Hal Finkel	38c3e2f5c5	[PowerPC] Fix test to pass on Darwin hosts git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225220 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 23:17:43 +00:00
Hal Finkel	fcfee17911	[PowerPC] Convert a README.txt entry into a better test We now produce the desired code as noted in the README.txt file (no spurious or). Remove the README entry and improve the regression test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225214 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 21:53:52 +00:00
Colin LeMahieu	e4f1dcdb83	[Hexagon] Adding add/sub with carry, logical shift left by immediate and memop instructions. Removing old defs without bits and updating references. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225210 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 21:36:38 +00:00
Hal Finkel	1b84bf2554	[PowerPC] Add a test for truncating a shifted load We now produce the desired code as noted in the README.txt file. Remove the README entry and add a regression test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225209 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 21:33:14 +00:00
Frederic Riss	5a0743e1e8	[dsymutil] Implement the BinaryHolder object and gain archive support. This object is meant to own the ObjectFiles and their underlying MemoryBuffer. It is basically the equivalent of an OwningBinary except that it efficiently handles Archives. It is optimized for efficiently providing mappings of members of the same archive when they are opened successively (which is standard in Darwin debug maps, objects from the same archive will be contiguous). Of course, the BinaryHolder will also be used by the DWARF linker once it is committed, but for now only the debug map parser uses it. With this change, you can run llvm-dsymutil on your Darwin debug build of clang and get a complete debug map for it. Differential Revision: http://reviews.llvm.org/D6690 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225207 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 21:29:28 +00:00
Hal Finkel	e7d845b709	[PowerPC] Add another test for load/store with update We now produce the desired code as noted in the README.txt file. Remove the README entry and add a regression test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225205 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 21:22:42 +00:00
Hal Finkel	ccc83e4a08	[PowerPC] Fold i1 extensions with other ops Consider this function from our README.txt file: int foo(int a, int b) { return (a < b) << 4; } We now explicitly track CR bits by default, so the comment in the README.txt about not really having a SETCC is no longer accurate, but we did generate this somewhat silly code: cmpw 0, 3, 4 li 3, 0 li 12, 1 isel 3, 12, 3, 0 sldi 3, 3, 4 blr which generates the zext as a select between 0 and 1, and then shifts the result by a constant amount. Here we preprocess the DAG in order to fold the results of operations on an extension of an i1 value into the SELECT_I[48] pseudo instruction when the resulting constant can be materialized using one instruction (just like the 0 and 1). This was not implemented as a DAGCombine because the resulting code would have been anti-canonical and depends on replacing chained user nodes, which does not fit well into the lowering paradigm. Now we generate: cmpw 0, 3, 4 li 3, 0 li 12, 16 isel 3, 12, 3, 0 blr which is less silly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225203 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 21:10:24 +00:00
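The fold in the entry above can be illustrated with a minimal sketch. This is Python rather than backend code, and the function names are hypothetical, chosen only for illustration. It shows that shifting the zero-extended compare result gives the same value as selecting directly between 0 and the pre-shifted constant 16, which is why the separate shift instruction can be dropped:

```python
def shift_of_zext(a: int, b: int) -> int:
    """Original form: zero-extend the i1 compare result, then shift left by 4."""
    return int(a < b) << 4


def folded_select(a: int, b: int) -> int:
    """Folded form: select directly between the constants 16 and 0."""
    return 16 if a < b else 0
```

Both functions model `(a < b) << 4` from the commit message. The second corresponds to the `li 12, 16` / `isel` sequence with no trailing shift.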
Colin LeMahieu	ca96263b05	[Hexagon] Adding rounding reg/reg variants, accumulating multiplies, and accumulating shifts. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225201 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 20:56:41 +00:00
Colin LeMahieu	27494b0633	[Hexagon] Adding V4 bit manipulating instructions, removing ALU defs without encoding bits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225199 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 20:35:54 +00:00
Colin LeMahieu	c8e734a561	[Hexagon] Adding V4 logic-logic instructions and tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225198 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 20:14:58 +00:00
Colin LeMahieu	e48ec2a918	[Hexagon] Adding orand, bitsplit reg/reg, and modwrap instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225197 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 20:04:40 +00:00
Hal Finkel	3ab10c1918	[PowerPC] Remove zexts after i32 ctlz The 64-bit semantics of cntlzw are not special, the 32-bit leading-zero count is stored as a 64-bit value in the range [0,32]. As a result, it is always zero extended, and it can be added to the PPCISelDAGToDAG peephole optimization as a frontier instruction for the removal of unnecessary zero extensions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225192 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 18:52:29 +00:00
Hal Finkel	0ef99720c5	[PowerPC] Remove zexts after byte-swapping loads lhbrx and lwbrx not only load their data with byte swapping, but also clear the upper 32 bits (at least). As a result, they can be added to the PPCISelDAGToDAG peephole optimization as frontier instructions for the removal of unnecessary zero extensions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225189 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 18:09:06 +00:00
Colin LeMahieu	9e989cf190	[Hexagon] Adding round reg/imm and bitsplit instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225188 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 18:08:21 +00:00
Ahmed Bougacha	3c9fb6e1ad	[AArch64] Improve codegen of store lane instructions by avoiding GPR usage. We used to generate code similar to: umov.b w8, v0[2] strb w8, [x0, x1] because the STRro patterns were preferred to ST1. Instead, we can avoid going through GPRs, and generate: add x8, x0, x1 st1.b { v0 }[2], [x8] This patch increases the ST1 AddedComplexity to achieve that. rdar://16372710 Differential Revision: http://reviews.llvm.org/D6202 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225183 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 17:10:26 +00:00
Ahmed Bougacha	c52cd839b9	[AArch64] Improve codegen of store lane 0 instructions by directly storing the subregister. For 0-lane stores, we used to generate code similar to: fmov w8, s0 str w8, [x0, x1, lsl #2] instead of: str s0, [x0, x1, lsl #2] To correct that: for store lane 0 patterns, directly match to STR <subreg>0. Byte-sized instructions don't have the special case for a 0 index, because FPR8s are defined to have untyped content. rdar://16372710 Differential Revision: http://reviews.llvm.org/D6772 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225181 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 17:02:28 +00:00
NAKAMURA Takumi	19d9f342ed	llvm/test/lit.cfg: have_ld_plugin_support(): Use decode() for stdout. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225171 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 14:18:04 +00:00
Karthik Bhat	050064d32c	Select lower fsub,fabs pattern to fabd on AArch64 This patch lowers patterns such as- fsub v0.4s, v0.4s, v1.4s fabs v0.4s, v0.4s to fabd v0.4s, v0.4s, v1.4s on AArch64. Review: http://reviews.llvm.org/D6791 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225169 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 13:57:59 +00:00
Charlie Turner	6abfc44aab	Parse Tag_compatibility correctly. Tag_compatibility takes two arguments, but before this patch it would erroneously accept just one; it now produces an error in that case. Change-Id: I530f918587620d0d5dfebf639944d6083871ef7d git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225167 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 13:26:37 +00:00
Charlie Turner	b99b8ffb7f	Emit the build attribute Tag_conformance. Claim conformance to version 2.09 of the ARM ABI. This build attribute must be emitted first amongst the build attributes when written to an object file. This is to simplify conformance detection by consumers. Change-Id: If9eddcfc416bc9ad6e5cc8cdcb05d0031af7657e git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225166 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 13:12:17 +00:00
Karthik Bhat	e239724d12	Select lower sub,abs pattern to sabd on AArch64 This patch lowers patterns such as- sub v0.4s, v0.4s, v1.4s abs v0.4s, v0.4s to sabd v0.4s, v0.4s, v1.4s on AArch64. Review: http://reviews.llvm.org/D6781 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225165 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 13:11:07 +00:00
Michael Kuperstein	25903ef9bc	Fix broken test from r225159. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225164 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 12:34:01 +00:00
Chandler Carruth	1ab487fdc7	[PM] Don't run the machinery of invalidating all the analysis passes when all are being preserved. We want to short-circuit this for a couple of reasons. One, I don't really want passes to grow a dependency on actually receiving their invalidate call when they've been preserved. I'm thinking about removing this entirely. But more importantly, preserving everything is likely to be the common case in a lot of scenarios, and it would be really good to bypass all of the invalidation and preservation machinery there. Avoiding calling N opaque functions to try to invalidate things that are by definition still valid seems important. =] This wasn't really inspired by much other than seeing the spam in the logging for analyses, but it seems better to get it checked in rather than forgetting about it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225163 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 12:32:11 +00:00
Chandler Carruth	040ca449b2	[PM] Add names and debug logging for analysis passes to the new pass manager. This starts to allow us to test analyses more easily, but it's really only the beginning. Some of the code here is still untestable without manual changes to create analysis passes, but I wanted to factor it into chunks as small as possible. Next up in order to be able to test things are, in no particular order: - No-op analysis passes so we don't have to use real ones to exercise the pass manager itself. - Automatic way of generating dummy passes that require an analysis be run, including a variant that calls a 'print' method on a pass to make it even easier to print out the results of an analysis. - Dummy passes that invalidate all analyses for their IR unit so we can test invalidation and re-runs. - Automatic way to print each analysis pass as it is re-run. - Automatic but optional verification of analysis passes everywhere possible. I'm not claiming I'll get to all of these immediately, but that's what is in the pipeline at some stage. I'm fleshing out exactly what I need and what to prioritize by working on converting analyses and then trying to test the conversion. =] git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225162 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 12:21:44 +00:00
Jiangning Liu	614fe873ce	Fixed a bug in memory dependence checking module of loop vectorization. The following loop should not be vectorized with current algorithm. {code} // loop body ... = a[i] (1) ... = a[i+1] (2) ....... a[i+1] = .... (3) a[i] = ... (4) {code} The algorithm tries to collect memory access candidates from AliasSetTracker, and then checks the memory dependences against one another. The memory accesses are unique in AliasSetTracker, and a single memory access in AliasSetTracker may map to multiple entries in AccessAnalysis, which could cover both 'read' and 'write'. Originally the algorithm only checked the 'write' entry in Accesses if only 'write' exists. This is incorrect: as a consequence it ignored all read accesses, and some RAW and WAR dependences were missed. For the case given above, if we ignore the two reads, the dependence between (1) and (3) would not be captured, and the loop would be incorrectly vectorized. The fix simply inserts a new loop to find all entries in Accesses. Since it skips most other memory accesses by checking the Value pointer at the very beginning of the loop, it should not increase compile-time visibly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225159 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 10:08:58 +00:00
Hal Finkel	0ef8f3189e	[PowerPC] Enable speculation of cttz/ctlz PPC has an instruction for ctlz with defined zero behavior, and our lowering of cttz (provided by DAGCombine) is also efficient and branchless, so speculating these makes sense. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225150 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 05:24:42 +00:00
Chandler Carruth	4f9a7277d1	[SROA] Apply a somewhat heavy and unpleasant hammer to fix PR22093, an assert out of the new pre-splitting in SROA. This fix makes the code do what was originally intended -- when we have a store of a load both dealing in the same alloca, we force them to both be pre-split with identical offsets. This is really quite hard to do because we can keep discovering problems as we go along. We have to track every load over the current alloca which for any reason becomes invalid for pre-splitting, and go back to remove all stores of those loads. I've included a couple of test cases derived from PR22093 that cover the different ways this can happen. While that PR only really triggered the first of these two, it's the same fundamental issue. The other challenge here is documented in a FIXME now. We end up being quite a bit more aggressive for pre-splitting when loads and stores don't refer to the same alloca. This aggressiveness comes at the cost of introducing potentially redundant loads. It isn't clear that this is the right balance. It might be considerably better to require that we only do pre-splitting when we can presplit every load and store involved in the entire operation. That would give more consistent if conservative results. Unfortunately, it requires a non-trivial change to the actual pre-splitting operation in order to correctly handle cases where we end up pre-splitting stores out-of-order. And it isn't 100% clear that this is the right direction, although I'm starting to suspect that it is. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225149 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 04:17:53 +00:00
Hal Finkel	9cad6c8a24	[PowerPC] Materialize i64 constants using rotation with masking r225135 added the ability to materialize i64 constants using rotations in order to reduce the instruction count. Sometimes we can use a rotation only with some extra masking, so that we take advantage of the fact that generating a bunch of extra higher-order 1 bits is easy using li/lis. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225147 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 03:41:38 +00:00
Chandler Carruth	51fa09d980	[PM] Wire up support for explicitly running the verifier pass. The required functionality has been there for some time, but I never managed to actually wire it into the command line registry of passes. Let's do that. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225144 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 00:08:53 +00:00
Simon Pilgrim	c0c36083da	[X86][SSE] Added vector packing test for pr12412 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225138 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-04 19:08:03 +00:00
Simon Pilgrim	dc18ec0e0d	[X86][SSE] Added vector integer truncation tests - based off pr15524 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225137 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-04 17:52:00 +00:00
Hal Finkel	2ac0826af3	[PowerPC] Materialize i64 constants using rotation Materializing full 64-bit constants on PPC64 can be expensive, requiring up to 5 instructions depending on the locations of the non-zero bits. Sometimes materializing a rotated constant, and then applying the inverse rotation, requires fewer instructions than the direct method. If so, do that instead. In r225132, I added support for forming constants using bit inversion. In effect, this reverts that commit and replaces it with rotation support. The bit inversion is useful for turning constants that are mostly ones into ones that are mostly zeros (thus enabling a more-efficient shift-based materialization), but the same effect can be obtained by using negative constants and a rotate, and that is at least as efficient, if not more. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225135 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-04 15:43:55 +00:00
Hal Finkel	d138a7bb3f	[PowerPC] Materialize i64 constants using bit inversion Materializing full 64-bit constants on PPC64 can be expensive, requiring up to 5 instructions depending on the locations of the non-zero bits. Sometimes materializing the bit-inverted constant, and then flipping the bits, requires fewer instructions than the direct method. If so, do that instead. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225132 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-04 12:35:03 +00:00
David Majnemer	07d7dbae9e	InstCombine: match can find ConstantExprs, don't assume we have a Value We assumed the output of a match was a Value, this would cause us to assert because we would fail a cast<>. Instead, use a helper in the Operator family to hide the distinction between Value and Constant. This fixes PR22087. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225127 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-04 07:36:02 +00:00
David Majnemer	77e22b7836	ValueTracking: ComputeNumSignBits should tolerate misshapen phi nodes PHI nodes can have zero operands in the middle of a transform. It is expected that utilities in Analysis don't freak out when this happens. Note that it is considered invalid to allow these misshapen phi nodes to make it to another pass. This fixes PR22086. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225126 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-04 07:06:53 +00:00
Saleem Abdulrasool	b19a485253	llvm-readobj: add support to dump COFF export tables This enhances llvm-readobj to print out the COFF export table, similar to the -coff-import option. This is useful for testing in lld. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225120 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-03 21:35:09 +00:00
Saleem Abdulrasool	97f8f69a7f	ARM: permit tail calls to weak externals on COFF Weak externals are resolved statically, so we can actually generate the tail call on PE/COFF targets without breaking the requirements. It is questionable whether we want to propagate the current behaviour for MachO as the requirements are part of the ARM ELF specifications, and it seems that prior to the SVN r215890, we would have tail'ed the call. For now, be conservative and only permit it on PE/COFF where the call will always be fully resolved. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225119 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-03 21:35:00 +00:00
Hal Finkel	e05b232c20	[PowerPC/BlockPlacement] Allow target to provide a per-loop alignment preference The existing code provided for specifying a global loop alignment preference. However, the preferred loop alignment might depend on the loop itself. For recent POWER cores, loops between 5 and 8 instructions should have 32-byte alignment (while the others are better with 16-byte alignment) so that the entire loop will fit in one i-cache line. To support this, getPrefLoopAlignment has been made virtual, and can be provided with an optional MachineLoop* so the target can inspect the loop before answering the query. The default behavior, as before, is to return the value set with setPrefLoopAlignment. MachineBlockPlacement now queries the target for each loop instead of only once per function. There should be no functional change for other targets. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225117 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-03 17:58:24 +00:00
Hal Finkel	a1d22cc789	[PowerPC] Use 16-byte alignment for modern cores for functions/loops Most modern PowerPC cores prefer that functions and loops start on 16-byte-aligned boundaries (*), so instruct block placement, etc. to make this happen. The branch selector has also been adjusted to account for the extra nops that might now be inserted before loop headers. (*) Some cores actually prefer other alignments for small loops, but that will be addressed in a follow-up commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225115 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-03 14:58:25 +00:00
Hal Finkel	958b670c34	[PowerPC] Add support for the CMPB instruction Newer POWER cores, and the A2, support the cmpb instruction. This instruction compares its operands, treating each of the 8 bytes in the GPRs separately, returning a 'mask' result of 0 (for false) or -1 (for true) in each byte. Code generation support is added, in the form of a PPCISelDAGToDAG DAG-preprocessing routine, that recognizes patterns close to what the instruction computes (either exactly, or related by a constant masking operation), and generates the cmpb instruction (along with any necessary constant masking operation). This can be expanded if use cases arise. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225106 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-03 01:16:37 +00:00
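The bytewise comparison described in the entry above can be modelled in a few lines. This is a hedged Python sketch of the semantics as stated in the commit message (each result byte is all ones where the operand bytes match, zero otherwise), not the backend's pattern-matching code. The function name and the fixed 64-bit width are assumptions made for illustration:

```python
def cmpb(ra: int, rb: int) -> int:
    """Model a 64-bit bytewise compare.

    Each result byte is 0xFF where the corresponding bytes of ra and rb
    are equal, and 0x00 where they differ.
    """
    result = 0
    for byte_index in range(8):
        shift = 8 * byte_index
        # Extract the same byte lane from both operands.
        byte_a = (ra >> shift) & 0xFF
        byte_b = (rb >> shift) & 0xFF
        if byte_a == byte_b:
            result |= 0xFF << shift
    return result
```

A pattern that computes such a mask with per-byte compares and selects is the kind of code the preprocessing routine could, in principle, replace with a single instruction.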
Kostya Serebryany	8c6ae1044a	[asan] simplify the tracing code, make it use the same guard variables as coverage git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225103 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-03 00:54:43 +00:00
Craig Topper	01c99892ca	[X86] Disassembler support for move to/from %rax with a 32-bit memory offset if REX.W and AdSize prefix are both present. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225099 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-03 00:00:20 +00:00
David Majnemer	5e9c6212a8	InstCombine: Detect when llvm.umul.with.overflow always overflows We know overflow always occurs if both ~LHSKnownZero * ~RHSKnownZero and LHSKnownOne * RHSKnownOne overflow. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225077 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-02 07:29:47 +00:00
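The condition in the entry above can be sketched as follows. This is an illustrative Python model, not InstCombine's implementation. It assumes the usual known-bits convention, where a "known zero" mask marks bits proven to be 0 and a "known one" mask marks bits proven to be 1, so `~KnownZero` is the largest possible value and `KnownOne` the smallest. The function name and the `bits` parameter are hypothetical:

```python
def umul_always_overflows(lhs_known_zero: int, lhs_known_one: int,
                          rhs_known_zero: int, rhs_known_one: int,
                          bits: int = 32) -> bool:
    """Return True if an unsigned multiply must overflow given known bits."""
    mask = (1 << bits) - 1
    # Largest value each operand can take: every bit not known to be zero.
    lhs_max = ~lhs_known_zero & mask
    rhs_max = ~rhs_known_zero & mask
    # Smallest value each operand can take: only the bits known to be one.
    lhs_min = lhs_known_one & mask
    rhs_min = rhs_known_one & mask
    # Overflow is certain when both products exceed the representable range.
    return lhs_max * rhs_max > mask and lhs_min * rhs_min > mask
```

For example, with 8-bit operands where the left side is known to have its top bit set and the right side is known to have bit 1 set, the smallest possible product is 0x80 * 0x02 = 0x100, which already exceeds 0xFF.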
Craig Topper	71fc42dbf6	[X86] Make the instructions that use AdSize16/32/64 co-exist together without using mode predicates. This is necessary to allow the disassembler to be able to handle AdSize32 instructions in 64-bit mode when address size prefix is used. Eventually we should probably also support 'addr32' and 'addr16' in the assembler to override the address size on some of these instructions. But for now we'll just use special operand types that will look up the current mode size to select the right instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225075 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-02 07:02:25 +00:00
Chandler Carruth	ce7f347da2	[SROA] Teach SROA to be more aggressive in splitting now that we have a pre-splitting pass over loads and stores. Historically, splitting could cause enough problems that I hamstrung the entire process with a requirement that splittable integer loads and stores must cover the entire alloca. All smaller loads and stores were unsplittable to prevent chaos from ensuing. With the new pre-splitting logic that does load/store pair splitting I introduced in r225061, we can now very nicely handle arbitrarily splittable loads and stores. In order to fully benefit from these smarts, we need to mark all of the integer loads and stores as splittable. However, we don't actually want to rewrite partitions with all integer loads and stores marked as splittable. This will fail to extract scalar integers from aggregates, which is kind of the point of SROA. =] In order to resolve this, what we really want to do is only do pre-splitting on the alloca slices with integer loads and stores fully splittable. This allows us to uncover all non-integer uses of the alloca that would benefit from a split in an integer load or store (and where introducing the split is safe because it is just memory transfer from a load to a store). Once done, we make all the non-whole-alloca integer loads and stores unsplittable just as they have historically been, repartition and rewrite. The result is that when there are integer loads and stores anywhere within an alloca (such as from a memcpy of a sub-object of a larger object), we can split them up if there are non-integer components to the aggregate hiding beneath. I've added the challenging test cases to demonstrate how this is able to promote to scalars even a case where we have even partially overlapping loads and stores. This restores the single-store behavior for small arrays of i8s which is really nice. I've restored both the little endian testing and big endian testing for these exactly as they were prior to r225061. 
It also forced me to be more aggressive in an alignment test to actually defeat SROA. =] Without the added volatiles there, we actually split up the weird i16 loads and produce nice double allocas with better alignment. This also uncovered a number of bugs where we failed to handle splittable load and store slices which didn't have a beginning offset of zero. Those fixes are included, and without them the existing test cases explode in glorious fireworks. =] I've kept support for leaving whole-alloca integer loads and stores as splittable even for the purpose of rewriting, but I think that's likely no longer needed. With the new pre-splitting, we might be able to remove all the splitting support for loads and stores from the rewriter. Not doing that in this patch to try to isolate any performance regressions that causes in an easy to find and revert chunk. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225074 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-02 03:55:54 +00:00
Chandler Carruth	40a8741994	[SROA] Add a test case for r225068 / PR22080. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225070 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-02 00:34:29 +00:00
Chandler Carruth	450b39e971	[SROA] Teach SROA how to much more intelligently handle split loads and stores. When there are accesses to an entire alloca with an integer load or store as well as accesses to small pieces of the alloca, SROA splits up the large integer accesses. In order to do that, it uses bit math to merge the small accesses into large integers. While this is effective, it produces insane IR that can cause significant problems in the rest of the optimizer: - It can cause load and store mismatches with GVN on the non-alloca side where we end up loading an i64 (or some such) rather than loading specific elements that are stored. - We can't always get rid of the integer bit math, which is why we can't always fix the loads and stores to work well with GVN. - This is especially bad when we have operations that mix poorly with integer bit math such as floating point operations. - It will block things like the vectorizer which might be able to handle the scalar stores that underlie the aggregate. At the same time, we can't just directly split up these loads and stores in all cases. If there is actual integer arithmetic involved on the values, then using integer bit math is actually the perfect lowering because we can often combine it heavily with the surrounding math. The solution this patch provides is to find places where SROA is partitioning aggregates into small elements, and look for splittable loads and stores that it can split all the way to some other adjacent load and store. These are uniformly the cases where failing to split the loads and stores hurts the optimizer that I have seen, and I've looked extensively at the code produced both from more and less aggressive approaches to this problem. However, it is quite tricky to actually do this in SROA. We may have loads and stores to the same alloca, or other complex patterns that are hard to handle. This complexity leads to the somewhat subtle algorithm implemented here. 
We have to do this entire process as a separate pass over the partitioning of the alloca, and split up all of the loads prior to splitting the stores so that we can handle safely the cases of overlapping, including partially overlapping, loads and stores to the same alloca. We also have to reconstitute the post-split slice configuration so we can avoid iterating again over all the alloca uses (the slow part of SROA). But we also have to ensure that when we split up loads and stores to other allocas, we do re-iterate over them in SROA to adapt to the more refined partitioning now required. With this, I actually think we can fix a long-standing TODO in SROA where I avoided splitting as many loads and stores as probably should be splittable. This limitation historically mitigated the fallout of all the bad things mentioned above. Now that we have more intelligent handling, I plan to remove the FIXME and more aggressively mark integer loads and stores as splittable. I'll do that in a follow-up patch to help with bisecting any fallout. The net result of this change should be more fine-grained and accurate scalars being formed out of aggregates. At the very least, Clang now generates perfect code for this high-level test case using std::complex<float>: #include <complex> void g1(std::complex<float> &x, float a, float b) { x += std::complex<float>(a, b); } void g2(std::complex<float> &x, float a, float b) { x -= std::complex<float>(a, b); } void foo(const std::complex<float> &x, float a, float b, std::complex<float> &x1, std::complex<float> &x2) { std::complex<float> l1 = x; g1(l1, a, b); std::complex<float> l2 = x; g2(l2, a, b); x1 = l1; x2 = l2; } This code isn't just hypothetical either. It was reduced out of the hot inner loops of essentially every part of the Eigen math library when using std::complex<float>. 
Those loops would consistently and pervasively hop between the floating point unit and the integer unit due to bit math extraction and insertion of floating point values that were "stored" in a 64-bit integer register around the loop backedge. So far, this change has passed a bootstrap and I have done some other testing and so far, no issues. That doesn't mean there won't be though, so I'll be prepared to help with any fallout. If you see performance swings in particular, please let me know. I'm very curious what all the impact of this change will be. Stay tuned for the follow-up to also split more integer loads and stores. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225061 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-01 11:54:38 +00:00
Hal Finkel	84cd524ee9	[PowerPC] Improve instruction selection bit-permuting operations (64-bit) This is the second installment of improvements to instruction selection for "bit permutation" instruction sequences. r224318 added logic for instruction selection for 32-bit bit permutation sequences, and this adds lowering for 64-bit sequences. The 64-bit sequences are more complicated than the 32-bit ones because: a) the 64-bit versions of the 32-bit rotate-and-mask instructions work by replicating the lower 32-bits of the value-to-be-rotated into the upper 32 bits -- and integrating this into the cost modeling for the various bit group operations is non-trivial b) unlike the 32-bit instructions in 32-bit mode, the rotate-and-mask instructions cannot, in one instruction, specify the mask starting index, the mask ending index, and the rotation factor. Also, forming arbitrary 64-bit constants is more complicated than in 32-bit mode because the number of instructions necessary is value dependent. Plus, support for 'late masking' was added: it is sometimes more efficient to treat the overall value as if it had no mandatory zero bits when planning the bit-group insertions, and then mask them in at the very end. Unfortunately, as the structure of the bit groups is different in the two cases, the more feasible implementation technique was to generate both instruction sequences, and then pick the shorter one. And finally, we now generate reasonable code for i64 bswap: rldicl 5, 3, 16, 0 rldicl 4, 3, 8, 0 rldicl 6, 3, 24, 0 rldimi 4, 5, 8, 48 rldicl 5, 3, 32, 0 rldimi 4, 6, 16, 40 rldicl 6, 3, 48, 0 rldimi 4, 5, 24, 32 rldicl 5, 3, 56, 0 rldimi 4, 6, 40, 16 rldimi 4, 5, 48, 8 rldimi 4, 3, 56, 0 vs. what we used to produce: li 4, 255 rldicl 5, 3, 24, 40 rldicl 6, 3, 40, 24 rldicl 7, 3, 56, 8 sldi 8, 3, 8 sldi 10, 3, 24 sldi 12, 3, 40 rldicl 0, 3, 8, 56 sldi 9, 4, 32 sldi 11, 4, 40 sldi 4, 4, 48 andi. 5, 5, 65280 andis. 6, 6, 255 andis. 
7, 7, 65280 sldi 3, 3, 56 and 8, 8, 9 and 4, 12, 4 and 9, 10, 11 or 6, 7, 6 or 5, 5, 0 or 3, 3, 4 or 7, 9, 8 or 4, 6, 5 or 3, 3, 7 or 3, 3, 4 which is 12 instructions, instead of 25, and seems optimal (at least in terms of code size). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225056 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-01 02:53:29 +00:00
Sanjay Patel	28650b8ec2	InstCombine: fsub nsz 0, X ==> fsub nsz -0.0, X Some day the backend may handle instruction-level fast math flags and make this transform unnecessary, but it's still better practice to use the canonical representation of fneg when possible (use a -0.0). This is a partial fix for PR20870 ( http://llvm.org/bugs/show_bug.cgi?id=20870 ). See also http://reviews.llvm.org/D6723. Differential Revision: http://reviews.llvm.org/D6731 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225050 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-31 22:14:05 +00:00
Rafael Espindola	8093abb745	Add r224985 back with a fix. The issue was that AArch64 has additional restrictions on when local relocations can be used. We have to take those into consideration when deciding to put a L symbol in the symbol table or not. Original message: Remove doesSectionRequireSymbols. In an assembly expression like bar: .long L0 + 1 the intended semantics is that bar will contain a pointer one byte past L0. In sections that are merged by content (strings, 4 byte constants, etc), a single position in the section doesn't give the linker enough information. For example, it would not be able to tell a relocation must point to the end of a string, since that would look just like the start of the next. The solution used in ELF is to use relocation with symbols if there is a non-zero addend. In MachO before this patch we would just keep all symbols in some sections. This would miss some cases (only cstrings on x86_64 were implemented) and was inefficient since most relocations have an addend of 0 and can be represented without the symbol. This patch implements the non-zero addend logic for MachO too. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225048 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-31 17:19:34 +00:00
Colin LeMahieu	f5943bd9a9	Reverting 225045 and 225043 and XFAIL multiline.ll on hexagon git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225047 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-31 17:14:35 +00:00
Rafael Espindola	85419e7e3b	Add a test for the recent compiler-rt build failure. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225046 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-31 16:58:05 +00:00
Rafael Espindola	937e781f49	Revert "Remove doesSectionRequireSymbols." This reverts commit r224985. I am investigating why it made an Apple bot unhappy. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225044 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-31 16:06:48 +00:00
Craig Topper	c602b726a8	[X86] Update disassembler tests for absolute move instructions to check the encodings. This provides testing for r225036. 64-bit mode is still broken. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225037 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-31 07:24:23 +00:00
David Majnemer	0f77ccd6bb	InstCombine: try to transform A-B < 0 into A < B We are allowed to move the 'B' to the right hand side if we can prove there is no signed overflow and if the comparison itself is signed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225034 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-31 04:21:41 +00:00
Alexey Samsonov	c0319dd9c2	Revert "merge consecutive stores of extracted vector elements" This reverts commit r224611. This change causes crashes in X86 DAG->DAG Instruction Selection. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225031 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-31 00:40:28 +00:00
Colin LeMahieu	96c631b191	[Hexagon] Adding accumulating add/sub, doubleword logic-not variants, doubleword bitfield extract, word parity, accumulating multiplies with saturation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225024 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-31 00:08:34 +00:00
David Blaikie	855324b9de	Fix a test case to not depend on asm comment syntax, so as to be portable Too many different comment characters - instead of trying to account for them all, instead disable the comments and just check for end-of-line instead. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225020 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-30 23:33:55 +00:00
David Blaikie	6fcdb2681c	Generalize even further, for ARM comment syntax (@) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225019 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-30 23:23:58 +00:00
Colin LeMahieu	cb5c5f5934	[Hexagon] Adding double-logic on predicate instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225018 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-30 23:22:39 +00:00
David Blaikie	2e05c34ba6	Generalize test case to handle different asm syntax (# or // comments) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225017 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-30 23:21:57 +00:00
Colin LeMahieu	6026119d9f	[Hexagon] Adding newvalue compare and jumps. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225015 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-30 23:04:21 +00:00
David Blaikie	1d68fc5021	DebugInfo: Omit is_stmt from line table entries on the same line. GCC does this for non-zero discriminators and since GCC doesn't produce column info, that was the only place it comes up there. For LLVM, since we can emit discriminators and/or column info, it makes more sense to invert the condition and just test for changes in line number. This should resolve at least some of the GDB 7.5 test suite failures created by recent Clang changes that increase the location fidelity (which, since Clang defaults to including column info on Linux by default created a bunch of cases that confused GDB). In theory we could do this better/differently by grouping actual source statements together in a similar manner to the way lexical scopes are handled but given that GDB isn't really in a position to consume that (& users are probably somewhat used to different lines being different 'statements') this seems the safest and cheapest change. (I'm concerned that doing this 'right' would bloat the debugloc data even further - something Duncan's working hard to address) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225011 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-30 22:47:13 +00:00
Colin LeMahieu	a7940ef0e4	[Hexagon] Adding postincrement register newvalue stores. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225010 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-30 22:34:08 +00:00
Colin LeMahieu	df2531486d	[Hexagon] Removing old newvalue store variants. Adding postincrement immediate newvalue stores. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225009 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-30 22:28:31 +00:00
Zoran Jovanovic	25547ee83c	[mips][microMIPS] Relocate with symbol for micromips symbols Differential Revision: http://reviews.llvm.org/D6796 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225008 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-30 22:04:16 +00:00
Colin LeMahieu	ab63a4c95e	[Hexagon] Adding indexed store new-value variants. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225007 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-30 22:00:26 +00:00
Colin LeMahieu	3fa758981d	[Hexagon] Adding indexed store of immediates. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225006 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-30 21:01:38 +00:00
Colin LeMahieu	65971bbfd7	[Hexagon] Adding indexed stores. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225005 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-30 20:42:23 +00:00
Peter Collingbourne	d8ae3e1fee	x86_64: Fix calls to __morestack under the large code model. Under the large code model, we cannot assume that __morestack lives within 2^31 bytes of the call site, so we cannot use pc-relative addressing. We cannot perform the call via a temporary register, as the rax register may be used to store the static chain, and all other suitable registers may be either callee-save or used for parameter passing. We cannot use the stack at this point either because __morestack manipulates the stack directly. To avoid these issues, perform an indirect call via a read-only memory location containing the address. This solution is not perfect, as it assumes that the .rodata section is laid out within 2^31 bytes of each function body, but this seems to be sufficient for JIT. Differential Revision: http://reviews.llvm.org/D6787 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225003 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-30 20:05:19 +00:00
Kostya Serebryany	dd890d5c5e	[asan] change _sanitizer_cov_module_init to accept int* instead of int** git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224999 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-30 19:29:28 +00:00
Michael Kuperstein	08c26613e1	[COFF] Don't try to add quotes to already quoted linker directives If a linker directive is already quoted, don't try to quote it again, otherwise it creates a mess. This pops up in places like: #pragma comment(linker,"\"/foo bar'\"") Differential Revision: http://reviews.llvm.org/D6792 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224998 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-30 19:23:48 +00:00
Colin LeMahieu	88e5659aaf	[Hexagon] Adding reg-reg indexed load forms. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224997 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-30 18:58:47 +00:00
Colin LeMahieu	066f43435a	[Hexagon] Adding compare byte/halfword reg-reg/reg-imm forms. Adding compare to general register reg-imm form. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224991 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-30 17:39:24 +00:00
Colin LeMahieu	af9e1c79a5	[Hexagon] Updating constant extender def, adding alu-not instructions, compare to general register, and inverted compares. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224989 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-30 15:44:17 +00:00
Rafael Espindola	65300b95e6	Remove doesSectionRequireSymbols. In an assembly expression like bar: .long L0 + 1 the intended semantics is that bar will contain a pointer one byte past L0. In sections that are merged by content (strings, 4 byte constants, etc), a single position in the section doesn't give the linker enough information. For example, it would not be able to tell a relocation must point to the end of a string, since that would look just like the start of the next. The solution used in ELF is to use relocation with symbols if there is a non-zero addend. In MachO before this patch we would just keep all symbols in some sections. This would miss some cases (only cstrings on x86_64 were implemented) and was inefficient since most relocations have an addend of 0 and can be represented without the symbol. This patch implements the non-zero addend logic for MachO too. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224985 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-30 13:13:27 +00:00
Rafael Espindola	d5dd993855	Simplify test a bit. It looks like the original intent was to check which symbols were created. With macho-dump the sections were being checked just to match which symbol was in which section. llvm-objdump prints the section a symbol is in. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224980 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-30 05:09:17 +00:00
Peter Zotov	21a0fa44e1	[OCaml] Fix bitrot in tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224979 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-30 03:24:14 +00:00
Peter Zotov	91bf887d6d	[lit] Make config.llvm_lib_dir available on cmake, too. The OCaml tests require config.llvm_lib_dir to determine the OCaml package search path. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224978 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-30 03:24:11 +00:00
Craig Topper	f8207ac705	Testcases for r224939. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224976 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-30 02:35:56 +00:00
Rafael Espindola	dbeada5a92	Convert test to llvm-readobj. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224973 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-30 01:34:06 +00:00
Philip Reames	35f43b8786	Semantic tests for memory invalidation at statepoints These are simply a collection of tests intended to show that information about the contents of gc references in the heap is lost at a statepoint. I've tried to write them so that they don't disallow correct transformations, while still being fairly easy to understand. p.s. Ideas for additional tests are welcome. Differential Revision: http://reviews.llvm.org/D6491 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224971 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-29 23:55:33 +00:00
Philip Reames	91a083c57f	Carry facts about nullness and undef across GC relocation This change implements four basic optimizations: If a relocated value isn't used, it doesn't need to be relocated. If the value being relocated is null, relocation doesn't change that. (Technically, this might be collector specific. I don't know of one which it doesn't work for though.) If the value being relocated is undef, the relocation is meaningless. If the value being relocated was known nonnull, the relocated pointer also isn't null. (Since it points to the same source language object.) I outlined other planned work in comments. Differential Revision: http://reviews.llvm.org/D6600 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224968 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-29 23:27:30 +00:00
Philip Reames	1714ad67bd	Refine the notion of MayThrow in LICM to include a header specific version In LICM, we have a check for an instruction which is guaranteed to execute and thus can't introduce any new faults if moved to the preheader. To handle a function which might unconditionally throw when first called, we check for any potentially throwing call in the loop and give up. This is unfortunate when the potentially throwing condition is down a rare path. It prevents essentially all LICM of potentially faulting instructions where the faulting condition is checked outside the loop. It also greatly diminishes the utility of loop unswitching since control dependent instructions - which are now likely in the loop's header block - will not be lifted by subsequent LICM runs. define void @nothrow_header(i64 %x, i64 %y, i1 %cond) { ; CHECK-LABEL: nothrow_header ; CHECK-LABEL: entry ; CHECK: %div = udiv i64 %x, %y ; CHECK-LABEL: loop ; CHECK: call void @use(i64 %div) entry: br label %loop loop: ; preds = %entry, %for.inc %div = udiv i64 %x, %y br i1 %cond, label %loop-if, label %exit loop-if: call void @use(i64 %div) br label %loop exit: ret void } The current patch really only helps with non-memory instructions (i.e. divs, etc..) since the maythrow call down the rare path will be considered to alias an otherwise hoistable load. The one exception is that it does kick in for loads which are known to be invariant without regard to other possible stores, i.e. those marked with either !invariant.load metadata or tbaa 'is constant memory' metadata. Differential Revision: http://reviews.llvm.org/D6725 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224965 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-29 23:00:57 +00:00
Philip Reames	456b7b602c	Loading from null is valid outside of addrspace 0 This patch fixes a miscompile where we were assuming that loading from null is undefined and thus we could assume it doesn't happen. This transform is perfectly legal in address space 0, but is not necessarily legal in other address spaces. We really should introduce a hook to control this property on a per target per address space basis. We may be losing valuable optimizations in some address spaces by being too conservative. Original patch by Thomas P Raoux (submitted to llvm-commits), tests and formatting fixes by me. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224961 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-29 22:46:21 +00:00
Rafael Espindola	02d187cdb9	Convert test to llvm-readobj. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224959 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-29 22:14:35 +00:00
Colin LeMahieu	7c58cad0ca	[Hexagon] Adding allocframe, post-increment circular immediate stores, post-increment circular register stores, and bit reversed post-increment stores. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224957 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-29 21:33:45 +00:00
Colin LeMahieu	0bd2ffae08	[Hexagon] Adding post-increment register form stores and register-immediate form stores with tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224952 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-29 20:44:51 +00:00
Colin LeMahieu	3dc54ee5a4	[Hexagon] Replacing the remaining postincrement stores with versions that have encoding bits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224951 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-29 20:00:43 +00:00
Rafael Espindola	c1c55ba767	Convert test to FileCheck. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224950 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-29 19:50:32 +00:00
Colin LeMahieu	d25cfdb649	[Hexagon] Renaming old multiclass for removal. Adding post-increment store classes and instruction defs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224949 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-29 19:42:14 +00:00
Rafael Espindola	a21d820952	Add segmented stack support for DragonFlyBSD. Patch by Michael Neumann. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224936 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-29 15:47:28 +00:00
NAKAMURA Takumi	ff95215754	llvm/test/CodeGen/X86/fast-isel-call-bool.ll: Add explicit -mtriple=x86_64-unknown to satisfy x64. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224907 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-28 23:37:11 +00:00
Keno Fischer	41bda9f201	[X86][ISel] Fix a regression I introduced in r224884 The else case ResultReg was not checked for validity. To my surprise, this case was not hit in any of the existing test cases. This includes a new test case that tests this path. Also drop the `target triple` declaration from the original test as suggested by H.J. Lu, because apparently with it the test won't be run on Linux git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224901 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-28 15:20:57 +00:00
Michael Kuperstein	bfa4a373f4	[X86] Add missing memory variants to AVX false dependency breaking Adds missing memory instruction variants to AVX false dependency breaking handling. (SSE was handled in r224246) Differential Revision: http://reviews.llvm.org/D6780 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224900 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-28 13:15:05 +00:00
Andrea Di Biagio	70a7cda495	[CodeGenPrepare] Teach when it is profitable to speculate calls to @llvm.cttz/ctlz. If the control flow is modelling an if-statement where the only instruction in the 'then' basic block (excluding the terminator) is a call to cttz/ctlz, CodeGenPrepare can try to speculate the cttz/ctlz call and simplify the control flow graph. Example: \code entry: %cmp = icmp eq i64 %val, 0 br i1 %cmp, label %end.bb, label %then.bb then.bb: %c = tail call i64 @llvm.cttz.i64(i64 %val, i1 true) br label %end.bb end.bb: %cond = phi i64 [ %c, %then.bb ], [ 64, %entry] \code In this example, basic block %then.bb is taken if value %val is not zero. Also, the phi node in %end.bb would propagate the size-of in bits of %val only if %val is equal to zero. With this patch, CodeGenPrepare will try to hoist the call to cttz from %then.bb into basic block %entry only if cttz is cheap to speculate for the target. Added two new hooks in TargetLowering.h to let targets customize the behavior (i.e. decide whether it is cheap or not to speculate calls to cttz/ctlz). The two new methods are 'isCheapToSpeculateCtlz' and 'isCheapToSpeculateCttz'. By default, both methods return 'false'. On X86, method 'isCheapToSpeculateCtlz' returns true only if the target has LZCNT. Method 'isCheapToSpeculateCttz' only returns true if the target has BMI. Differential Revision: http://reviews.llvm.org/D6728 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224899 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-28 11:07:35 +00:00
Elena Demikhovsky	8499a501e4	Scalarizer for masked load and store intrinsics. Masked vector intrinsics are a part of common LLVM IR, but they are really supported on AVX2 and AVX-512 targets. I added code that translates masked intrinsics for all other targets. The masked vector intrinsic is converted to a chain of scalar operations inside conditional basic blocks. http://reviews.llvm.org/D6436 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224897 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-28 08:54:45 +00:00
David Majnemer	bd64447bf3	PowerPC: CTR shouldn't fire if a TLS call is in the loop Determining the address of a TLS variable results in a function call in certain TLS models. This means that a simple ICmpInst might actually result in invalidating the CTR register. In such cases, do not attempt to rely on the CTR register for loop optimization purposes. This fixes PR22034. Differential Revision: http://reviews.llvm.org/D6786 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224890 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-27 19:45:38 +00:00
Keno Fischer	cc80af1b4f	[FastIsel][X86] Fix invalid register replacement for bool args Summary: Consider the following IR: %3 = load i8* undef %4 = trunc i8 %3 to i1 %5 = call %jl_value_t.0* @foo(..., i1 %4, ...) ret %jl_value_t.0* %5 Bools (that are the result of direct truncs) are lowered as whatever the argument to the trunc was and an "and 1", causing the part of the MBB responsible for this argument to look something like this: %vreg8<def,tied1> = AND8ri %vreg7<kill,tied0>, 1, %EFLAGS<imp-def>; GR8:%vreg8,%vreg7 Later, when the load is lowered, it will insert %vreg15<def> = MOV8rm %vreg14, 1, %noreg, 0, %noreg; mem:LD1[undef] GR8:%vreg15 GR64:%vreg14 but remember to (at the end of isel) replace vreg7 by vreg15. Now for the bug. In fast isel lowering, we mistakenly mark vreg8 as the result of the load instead of the trunc. This adds a fixup to have vreg8 replaced by whatever the result of the load is as well, so we end up with %vreg15<def,tied1> = AND8ri %vreg15<kill,tied0>, 1, %EFLAGS<imp-def>; GR8:%vreg15 which is an SSA violation and causes problems later down the road. This fixes PR21557. Test Plan: The test case from PR21557 is added to the test suite. Reviewers: ributzka Reviewed By: ributzka Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6245 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224884 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-27 13:10:15 +00:00
Rafael Espindola	aceb47b808	Convert test to llvm-readobj. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224872 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-26 22:47:39 +00:00
Colin LeMahieu	17946361cc	[Hexagon] Adding auto-incrementing loads with and without byte reversal. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224871 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-26 21:09:25 +00:00
Colin LeMahieu	de2cee5556	[Hexagon] Adding locked loads. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224870 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-26 20:42:27 +00:00
Colin LeMahieu	6ff5e4862d	[Hexagon] Adding deallocframe and circular addressing loads. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224869 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-26 20:30:58 +00:00
Colin LeMahieu	ffba450190	[Hexagon] Adding remaining post-increment instruction variants. Removing unused classes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224868 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-26 19:31:46 +00:00
Colin LeMahieu	a46bee194d	[Hexagon] Adding post-increment unsigned byte loads. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224867 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-26 19:12:11 +00:00
Colin LeMahieu	3c52b7b9f2	[Hexagon] Adding post-increment signed byte loads with tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224866 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-26 18:57:13 +00:00
Rafael Espindola	21b9b20f36	Use llvm-readobj. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224864 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-26 18:22:05 +00:00
Craig Topper	a996db696b	[X86] Add the debug registers DR8-DR15 so we can assemble and disassemble references to them. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224862 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-26 18:20:05 +00:00
Craig Topper	6eb3e3ce10	[X86] Don't fail disassembly if REX.R/REX.B is used on an MMX register. Similar fix to not fail to disassemble CR9-CR15 references. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224861 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-26 18:19:44 +00:00
Timur Iskhodzhanov	f4076dc995	Band-aid fix for PR22032: don't emit DWARF debug info if AddressSanitizer is enabled on Windows git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224860 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-26 17:00:51 +00:00
Rafael Espindola	3bea6d7604	No need to run llvm-as. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224859 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-26 16:42:47 +00:00
David Majnemer	7627d9c229	InstCombine: Infer nuw for multiplies A multiply cannot unsigned wrap if there are bitwidth, or more, leading zero bits between the two operands. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224849 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-26 09:50:35 +00:00
David Majnemer	998ae69abe	InstCombine: Infer nsw for multiplies We already utilize this logic for reducing overflow intrinsics, it makes sense to reuse it for normal multiplies as well. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224847 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-26 09:10:14 +00:00
Craig Topper	654a66dbd3	Teach disassembler to handle illegal immediates on (v)cmpps/pd/ss/sd instructions. Instead of rejecting we'll just generate the _alt forms that don't try to alter the mnemonic. While I'm here, merge some common code in the Instruction printers for the condition code replacement and fix the mask on SSE to be 3-bits instead of 4. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224846 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-26 06:36:28 +00:00
Hal Finkel	d7b2788e51	[PowerPC] [FastISel] i1 constants must be zero extended When materializing constant i1 values, they must be zero extended. We represent i1 values as [0, 1], not [0, -1], in i32 registers. As it turns out, this code path was dead for i1 values prior to r216006 (which is why this did not manifest in miscompiles until recently). Fixes -O0 self-hosting on PPC64/Linux. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224842 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-25 23:08:25 +00:00
Elena Demikhovsky	b31322328a	Masked Load/Store - Changed the order of parameters in intrinsics. No functional changes. The documentation is coming. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224829 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-25 07:49:20 +00:00
David Majnemer	e277a13a71	CodeGen: Error on redefinitions instead of asserting It's possible to have a prior definition of a symbol in module asm. Raise an error instead of crashing. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224828 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-24 23:06:55 +00:00
David Majnemer	d36cad9914	CodeGen: Allow aliases to be overridden by variables git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224827 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-24 22:44:29 +00:00
David Majnemer	e54eacce75	MC: Label definitions are permitted after .set directives .set directives may be overridden by other .set directives as well as label definitions. This fixes PR22019. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224811 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-24 10:27:50 +00:00
Saleem Abdulrasool	3681929e11	IAS: correct debug line info for asm macros Correct the line information generation for preprocessed assembly. Although we tracked the source information for the macro instantiation, we failed to account for the fact that we were instantiating a macro, which is populated into a new buffer and that the line information would be relative to the definition rather than the actual instantiation location. This could cause the line number associated with the statement to be very high due to wrapping of the difference calculated for the preprocessor line information emitted into the stream. Properly calculate the line for the macro instantiation, referencing the line where the macro is actually used as GCC/gas do. The test case uses x86, though the same problem exists on any other target using the LLVM IAS. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224810 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-24 06:32:43 +00:00
David Majnemer	4714bfa1db	MC: Don't emit .no_dead_strip on targets which don't support it git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224808 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-24 04:11:42 +00:00
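
The rule stated in commit 7627d9c229 above ("A multiply cannot unsigned wrap if there are bitwidth, or more, leading zero bits between the two operands") can be checked with a small sketch. This is only an illustration on concrete values, not the InstCombine code itself, which reasons about bits known at compile time. The helper names `leading_zeros` and `mul_cannot_unsigned_wrap` are hypothetical and do not come from LLVM.

```python
def leading_zeros(x: int, bits: int) -> int:
    """Count leading zero bits of x viewed as an unsigned `bits`-wide integer."""
    if not 0 <= x < (1 << bits):
        raise ValueError("x does not fit in the given bit width")
    return bits - x.bit_length()


def mul_cannot_unsigned_wrap(a: int, b: int, bits: int) -> bool:
    """Sufficient (not necessary) condition for a * b to fit in `bits` bits.

    If a has `la` leading zeros and b has `lb`, then a < 2**(bits - la) and
    b < 2**(bits - lb), so a * b < 2**(2 * bits - la - lb). That bound is at
    most 2**bits whenever la + lb >= bits.
    """
    return leading_zeros(a, bits) + leading_zeros(b, bits) >= bits


def check_all(bits: int) -> bool:
    """Exhaustively confirm the condition never admits a wrapping multiply."""
    limit = 1 << bits
    return all(
        a * b < limit
        for a in range(limit)
        for b in range(limit)
        if mul_cannot_unsigned_wrap(a, b, bits)
    )


if __name__ == "__main__":
    print(check_all(8))
```

The condition is conservative: some pairs whose product fits (for example 16 * 15 at 8 bits) are rejected because their leading-zero counts sum to less than the bit width.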


28004 Commits