llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-08-22 10:29:35 +00:00

Author	SHA1	Message	Date
Hal Finkel	b6bb7db62b	[PowerPC] Fix calls to non-function objects Looking at r225438 inspired me to see how the PowerPC backend handled the situation (calling a bitcasted TLS global), and it turns out we also produced an error (cannot select ...). What it means to "call" something that is not a function is implementation and platform specific, but in the name of doing something (besides crashing), this makes sure we do what GCC does (treat all such calls as calls through a function pointer -- meaning that the pointer is assumed, as is the convention on PPC, to point to a function descriptor structure holding the actual code address along with the function's TOC pointer and environment pointer). As GCC does, we now do the same for calling regular (non-TLS) non-function globals too. I'm not sure whether this is the most useful way to define the behavior, but at least we won't be alone. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225617 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-12 04:34:47 +00:00
David Majnemer	85a0cb9bf2	Revert most of r225597 We can't rely on a DataLayout enlightened constant folder. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225599 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-11 07:29:51 +00:00
David Majnemer	d2f4460ee7	X86: Properly decode shuffle masks when the constant pool type is weird It's possible for the constant pool entry for the shuffle mask to come from a completely different operation. This occurs when Constants have the same bit pattern but have different types. Make DecodePSHUFBMask tolerant of types which, after a bitcast, are appropriately sized vector types. This fixes PR22188. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225597 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-11 05:08:57 +00:00
Saleem Abdulrasool	776673ea09	X86: teach X86TargetLowering about L,M,O constraints Teach the ISelLowering for X86 about the L,M,O target specific constraints. Although, for the moment, clang performs constraint validation and prevents passing along inline asm which may have immediate constant constraints violated, the backend should be able to cope with the invalid inline asm a bit better. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225596 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-11 04:39:24 +00:00
Chandler Carruth	561088eb5d	[x86] Remove some windows line endings that snuck into the tests here. Folks on Windows, remember to set up your subversion to strip these when submitting... git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225593 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-11 01:36:20 +00:00
Sanjoy Das	7f0da20b97	Fix PR22179. We were incorrectly inferring nsw for certain SCEVs. We can be more aggressive here (see Richard Smith's comment on http://llvm.org/bugs/show_bug.cgi?id=22179) but this change just focuses on correctness. Differential Revision: http://reviews.llvm.org/D6914 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225591 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-10 23:41:24 +00:00
Simon Pilgrim	47abf0e3da	[X86][SSE] Improved (v)insertps shuffle matching In the current code we only attempt to match against insertps if we have exactly one element from the second input vector, irrespective of how much of the shuffle result is zeroable. This patch checks to see if there is a single non-zeroable element from either input that requires insertion. It also supports matching of cases where only one of the inputs need to be referenced. We also split insertps shuffle matching off into a new lowerVectorShuffleAsInsertPS function. Differential Revision: http://reviews.llvm.org/D6879 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225589 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-10 19:45:33 +00:00
Hal Finkel	9ae5b7a40a	[PowerPC] Mark zext of a small scalar load as free This initial implementation of PPCTargetLowering::isZExtFree marks as free zexts of small scalar loads (that are not sign-extending). This callback is used by SelectionDAGBuilder's RegsForValue::getCopyToRegs, and thus to determine whether a zext or an anyext is used to lower illegally-typed PHIs. Because later truncates of zero-extended values are nops, this allows for the elimination of later unnecessary truncations. Fixes the initial complaint associated with PR22120. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225584 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-10 08:21:59 +00:00
Justin Hibbits	1c6936f6d7	Fully fix Bug #22115 . Summary: In the previous commit, the register was saved, but space was not allocated. This resulted in the parameter save area potentially clobbering r30, leading to nasty results. Test Plan: Tests updated Reviewers: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6906 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225573 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-10 01:57:21 +00:00
Simon Pilgrim	34630b6ea9	[X86][SSE] Avoid vector byte shuffles with zero by using pshufb to create zeros pshufb can shuffle in zero bytes as well as bytes from a source vector - we can use this to avoid having to shuffle 2 vectors and ORing the result when the used inputs from a vector are all zeroable. Differential Revision: http://reviews.llvm.org/D6878 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225551 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 22:03:19 +00:00
Daniel Sanders	8d7b0bdcf0	[mips] Add support for accessing $gp as a named register. Summary: Mips Linux uses $gp to hold a pointer to thread info structure and accesses it with a named register. This makes this work for LLVM. The N32 ABI doesn't quite work yet since the frontend generates incorrect IR for this case. It neglects to truncate the 64-bit GPR to a 32-bit value before converting to a pointer. Given correct IR (as in the testcase in this patch), it works correctly. Reviewers: sstankovic, vmedic, atanasyan Reviewed By: atanasyan Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6893 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225529 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 17:21:30 +00:00
Matthias Braun	c41acffe22	RegisterCoalescer: Fix removeCopyByCommutingDef with subreg liveness The code that eliminated additional coalescable copies in removeCopyByCommutingDef() used MergeValueNumberInto() which internally may merge A into B or B into A. In this case A and B had different Def points, so we have to reset ValNo.Def to the intended one after merging. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225503 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 03:01:31 +00:00
Hal Finkel	4e98296890	[PowerPC] Fold [sz]ext with fp_to_int lowering where possible On modern cores with lfiw[az]x, we can fold a sign or zero extension from i32 to i64 into the load necessary for an i64 -> fp conversion. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225493 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 01:34:30 +00:00
Hal Finkel	b7c01bf403	[PowerPC] Mark all instructions as non-cheap for MachineLICM MachineLICM uses a callback named hasLowDefLatency to determine if an instruction def operand has a 'low' latency. If all relevant operands have a 'low' latency, the instruction is considered too cheap to hoist out of loops even in low-register-pressure situations. On PowerPC cores, both the embedded cores and the others, there is no reason to believe that this is a good choice: all instructions have a cost inside a loop, and hoisting them when not limited by register pressure is a reasonable default. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225471 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 22:11:49 +00:00
Akira Hatanaka	40cd57eb5c	[ARM] Fix a bug in constant island pass that was triggering an assertion. The assert was being triggered when the distance between a constant pool entry and its user exceeded the maximally allowed distance after thumb2 branch shortening. A padding was inserted after a thumb2 branch instruction was shrunk, which caused the user to be out of range. This is wrong as the padding should have been inserted by the layout algorithm so that the distance between two instructions doesn't grow later during thumb2 instruction optimization. This commit fixes the code in ARMConstantIslands::createNewWater to call computeBlockSize and set BasicBlock::Unalign when a branch instruction is inserted to create new water after a basic block. A non-zero Unalign causes the worst-case padding to be inserted when adjustBBOffsetsAfter is called to recompute the basic block offsets. rdar://problem/19130476 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225467 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 20:44:50 +00:00
Justin Hibbits	77e85a150c	Add saving and restoring of r30 to the prologue and epilogue, respectively Summary: The PIC additions didn't update the prologue and epilogue code to save and restore r30 (PIC base register). This does that. Test Plan: Tests updated. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6876 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225450 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 15:47:19 +00:00
Kristof Beyls	d1cee9b3bc	Fix large stack alignment codegen for ARM and Thumb2 targets This partially fixes PR13007 (ARM CodeGen fails with large stack alignment): for ARM and Thumb2 targets, but not for Thumb1, as it seems stack alignment for Thumb1 targets hasn't been supported at all. Producing an aligned stack pointer is done by zero-ing out the lower bits of the stack pointer. The BIC instruction was used for this. However, the immediate field of the BIC instruction only allows to encode an immediate that can zero out up to a maximum of the 8 lower bits. When a larger alignment is requested, a BIC instruction cannot be used; llvm was silently producing incorrect code in this case. This commit fixes code generation for large stack aligments by using the BFC instruction instead, when the BFC instruction is available. When not, it uses 2 instructions: a right shift, followed by a left shift to zero out the lower bits. The lowering of ARM::Int_eh_sjlj_dispatchsetup still has code that unconditionally uses BIC to realign the stack pointer, so it very likely has the same problem. However, I wasn't able to produce a test case for that. This commit adds an assert so that the compiler will fail the assert instead of silently generating wrong code if this is ever reached. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225446 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 15:09:14 +00:00
Tom Stellard	9a6e4f08fe	R600/SI: Remove SIISelLowering::legalizeOperands() Its functionality has been replaced by calling SIInstrInfo::legalizeOperands() from SIISelLowering::AdjstInstrPostInstrSelection() and running the SIFoldOperands and SIShrinkInstructions passes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225445 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 15:08:17 +00:00
Elena Demikhovsky	6e8b53da17	Masked Load/Store - fixed a bug in type legalization. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225441 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 12:29:19 +00:00
Michael Kuperstein	477eba5f81	Fix a think-o in the test for r225438. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225440 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 12:05:02 +00:00
Michael Kuperstein	0858c28ca8	[X86] Don't try to generate direct calls to TLS globals The call lowering assumes that if the callee is a global, we want to emit a direct call. This is correct for regular globals, but not for TLS ones. Differential Revision: http://reviews.llvm.org/D6862 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225438 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 11:50:58 +00:00
Craig Topper	cb964a5c58	Fix test case I missed in r225432. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225434 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 07:57:27 +00:00
Quentin Colombet	9d60e0ff0a	[RegAllocGreedy] Introduce a late pass to repair broken hints. A broken hint is a copy where both ends are assigned different colors. When a variable gets evicted in the neighborhood of such copies, it is likely we can reconcile some of them. Context Copies are inserted during the register allocation via splitting. These split points are required to relax the constraints on the allocation problem. When such a point is inserted, both ends of the copy would not share the same color with respect to the current allocation problem. When variables get evicted, the allocation problem becomes different and some split point may not be required anymore. However, the related variables may already have been colored. This usually shows up in the assembly with pattern like this: def A ... save A to B def A use A restore A from B ... use B Whereas we could simply have done: def B ... def A use A ... use B Proposed Solution A variable having a broken hint is marked for late recoloring if and only if selecting a register for it evict another variable. Indeed, if no eviction happens this is pointless to look for recoloring opportunities as it means the situation was the same as the initial allocation problem where we had to break the hint. Finally, when everything has been allocated, we look for recoloring opportunities for all the identified candidates. The recoloring is performed very late to rely on accurate copy cost (all involved variables are allocated). The recoloring is simple unlike the last change recoloring. It propagates the color of the broken hint to all its copy-related variables. If the color is available for them, the recoloring uses it, otherwise it gives up on that hint even if a more complex coloring would have worked. The recoloring happens only if it is profitable. The profitability is evaluated using the expected frequency of the copies of the currently recolored variable with a) its current color and b) with the target color. If a) is greater or equal than b), then it is profitable and the recoloring happen. Example Consider the following example: BB1: a = b = BB2: ... = b = a Let us assume b gets split: BB1: a = b = BB2: c = b ... d = c = d = a Because of how the allocation work, b, c, and d may be assigned different colors. Now, if a gets evicted to make room for c, assuming b and d were assigned to something different than a. We end up with: BB1: a = st a, SpillSlot b = BB2: c = b ... d = c = d e = ld SpillSlot = e This is likely that we can assign the same register for b, c, and d, getting rid of 2 copies. Performances Both ARM64 and x86_64 show performance improvements of up to 3% for the llvm-testsuite + externals with Os and O3. There are a few regressions too that comes from the (in)accuracy of the block frequency estimate. <rdar://problem/18312047> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225422 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 01:16:39 +00:00
Matthias Braun	a065cf13cd	RegisterCoalescer: Fix valuesIdentical() in some subrange merge cases. I got confused and assumed SrcIdx/DstIdx of the CoalescerPair is a subregister index in SrcReg/DstReg, but they are actually subregister indices of the coalesced register that get you back to SrcReg/DstReg when applied. Fixed the bug, improved comments and simplified code accordingly. Testcase by Tom Stellard! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225415 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 23:58:38 +00:00
Philip Reames	a7f8f932a6	[GC] improve testing around gc.relocate and fix a test Patch by: Ramkumar Ramachandra <artagnon@gmail.com> "This patch started out as an exploration of gc.relocate, and an attempt to write a simple test in call-lowering. I then noticed that the arguments of gc.relocate were not checked fully, so I went in and fixed a few things. Finally, the most important outcome of this patch is that my new error handling code caught a bug in a callsite in stackmap-format." Differential Revision: http://reviews.llvm.org/D6824 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225412 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 22:48:01 +00:00
Tom Stellard	a36b682c17	R600/SI: Commute instructions to enable more folding opportunities git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225410 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 22:44:19 +00:00
Tom Stellard	a3ee583339	R600/SI: Only fold immediates that have one use Folding the same immediate into multiple instruction will increase program size, which can hurt performance. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225405 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 22:18:27 +00:00
Olivier Sallenave	033a537a84	More FMA folding opportunities. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225380 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 20:54:17 +00:00
Tom Stellard	f7587043ef	R600/SI: Add a V_MOV_B64 pseudo instruction This is used to simplify the SIFoldOperands pass and make it easier to fold immediates. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225373 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 20:27:25 +00:00
Tom Stellard	546520a727	R600/SI: Teach SIFoldOperands to split 64-bit constants when folding This allows folding of sequences like: s[0:1] = s_mov_b64 4 v_add_i32 v0, s0, v0 v_addc_u32 v1, s1, v1 into v_add_i32 v0, 4, v0 v_add_i32 v1, 0, v1 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225369 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 19:56:17 +00:00
Philip Reames	28fa9e1e9f	Introduce an example statepoint GC strategy This change includes the most basic possible GCStrategy for a GC which is using the statepoint lowering code. At the moment, this GCStrategy doesn't really do much - aside from actually generate correct stackmaps that is - but I went ahead and added a few extra correctness checks as proof of concept. It's mostly here to provide documentation on how to do one, and to provide a point for various optimization legality hooks I'd like to add going forward. (For context, see the TODOs in InstCombine around gc.relocate.) Most of the validation logic added here as proof of concept will soon move in to the Verifier. That move is dependent on http://reviews.llvm.org/D6811 There was discussion in the review thread about addrspace(1) being reserved for something. I'm going to follow up on a seperate llvmdev thread. If needed, I'll update all the code at once. Note that I am deliberately not making a GCStrategy required to use gc.statepoints with this change. I want to give folks out of tree - including myself - a chance to migrate. In a week or two, I'll make having a GCStrategy be required for gc.statepoints. To this end, I added the gc tag to one of the test cases but not others. Differential Revision: http://reviews.llvm.org/D6808 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225365 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 19:07:50 +00:00
David Majnemer	60812c05e7	X86: Allow the stack probe size to be configurable per function LLVM emits stack probes on Windows targets to ensure that the stack is correctly accessed. However, the amount of stack allocated before emitting such a probe is hardcoded to 4096. It is desirable to have this be configurable so that a function might opt-out of stack probes. Our level of granularity is at the function level instead of, say, the module level to permit proper generation of code after LTO. Patch by Andrew H! N.B. The inliner needs to be updated to properly consider what happens after inlining a function with a specific stack-probe-size into another function with a different stack-probe-size. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225360 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 18:14:07 +00:00
Ahmed Bougacha	d412c608fc	[X86] Teach FCOPYSIGN lowering to recognize constant magnitudes. For code like: float foo(float x) { return copysign(1.0, x); } We used to generate: andps <-0.000000e+00,0,0,0>, %xmm0 movss <1.000000e+00>, %xmm1 andps <nan>, %xmm1 orps %xmm0, %xmm1 Basically doing an abs(1.0f) in the two middle instructions. We now generate: andps <-0.000000e+00,0,0,0>, %xmm0 orps <1.000000e+00,0,0,0>, %xmm0 Builds on cleanups r223415, r223542. rdar://19049548 Differential Revision: http://reviews.llvm.org/D6555 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225357 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 17:33:03 +00:00
Charlie Turner	7fbbc81d65	[ARM] Add missing Tag_DIV_use tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225348 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 11:37:40 +00:00
Karthik Bhat	f2b3638c3d	Revert r225165 and r225169 Even thouh gcc produces simialr instructions as Owen pointed out the two patterns aren’t equivalent in the case where the original subtraction could have caused an overflow. Reverting the same. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225341 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 06:34:34 +00:00
Matt Arsenault	6a72b20325	R600/SI: Add combine for isinfinite pattern git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225310 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 23:00:46 +00:00
Matt Arsenault	42d9f7cf0a	R600/SI: Pattern match isinf to v_cmp_class instructions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225307 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 23:00:41 +00:00
Matt Arsenault	a5b2b64292	R600/SI: Add basic DAG combines for fp_class git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225306 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 23:00:39 +00:00
Matt Arsenault	b6520ab625	R600/SI: Add class intrinsic git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225305 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 23:00:37 +00:00
Rafael Espindola	f907a26bc2	Change the .ll syntax for comdats and add a syntactic sugar. In order to make comdats always explicit in the IR, we decided to make the syntax a bit more compact for the case of a GlobalObject in a comdat with the same name. Just dropping the $name causes problems for @foo = globabl i32 0, comdat $bar = comdat ... and declare void @foo() comdat $bar = comdat ... So the syntax is changed to @g1 = globabl i32 0, comdat($c1) @g2 = globabl i32 0, comdat and declare void @foo() comdat($c1) declare void @foo() comdat git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225302 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 22:55:16 +00:00
Hal Finkel	8e9ba0e588	[PowerPC] Reuse a load operand in int->fp conversions int->fp conversions on PPC must be done through memory loads and stores. On a modern core, this process begins by storing the int value to memory, then loading it using a (sometimes special) FP load instruction. Unfortunately, we would do this even when the value to be converted was itself a load, and we can just use that same memory location instead of copying it to another first. There is a slight complication when handling int_to_fp(fp_to_int(x)) pairs, because the fp_to_int operand has not been lowered when the int_to_fp is being lowered. We handle this specially by invoking fp_to_int's lowering logic (partially) and getting the necessary memory location (some trivial refactoring was done to make this possible). This is all somewhat ugly, and it would be nice if some later CodeGen stage could just clean this stuff up, but because doing so would involve modifying target-specific nodes (or instructions), it is not immediately clear how that would work. Also, remove a related entry from the README.txt for which we now generate reasonable code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225301 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 22:31:02 +00:00
Tom Stellard	bac89f3dd2	R600/SI: Insert s_waitcnt before s_barrier instructions. This ensures that all memory operations are complete when all threads reach the barrier. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225290 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 19:52:07 +00:00
Tom Stellard	1f996fa36b	R600/SI: Add a stub GCNTargetMachine This is equivalent to the AMDGPUTargetMachine now, but it is the starting point for separating R600 and GCN functionality into separate targets. It is recommened that users start using the gcn triple for GCN-based GPUs, because using the r600 triple for these GPUs will be deprecated in the future. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225277 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 18:00:21 +00:00
Andrea Di Biagio	e46783d5b7	[CodeGenPrepare] Improved logic to speculate calls to cttz/ctlz. This patch improves the logic added at revision 224899 (see review D6728) that teaches the backend when it is profitable to speculate calls to cttz/ctlz. The original algorithm conservatively avoided speculating more than one instruction from a basic block in a control flow grap modelling an if-statement. In particular, the only allowed instruction (excluding the terminator) was a call to cttz/ctlz. However, there are cases where we could be less conservative and still be able to speculate a call to cttz/ctlz. With this patch, CodeGenPrepare now tries to speculate a cttz/ctlz if the result is zero extended/truncated in the same basic block, and the zext/trunc instruction is "free" for the target. Added new test cases to CodeGen/X86/cttz-ctlz.ll Differential Revision: http://reviews.llvm.org/D6853 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225274 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 17:41:18 +00:00
Hal Finkel	15914b5c22	[PowerPC] Add a regression test for r225251 In r225251, I removed an old entry from the README.txt file. While there are several contributing factors (including pieces in Clang's ABI code), upon further reflection, the backend part deserves a regression test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225268 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 16:46:37 +00:00
Colin LeMahieu	a24e012976	[Hexagon] Adding dealloc_return encoding and absolute address stores. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225267 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 16:15:15 +00:00
David Majnemer	b3065539bd	X86: Don't make illegal GOTTPOFF relocations "ELF Handling for Thread-Local Storage" specifies that R_X86_64_GOTTPOFF relocation target a movq or addq instruction. Prohibit the truncation of such loads to movl or addl. This fixes PR22083. Differential Revision: http://reviews.llvm.org/D6839 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225250 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 07:12:52 +00:00
Hal Finkel	10ae865847	[PowerPC] Improve int_to_fp(fp_to_int(x)) combining The old target DAG combine that allowed for performing int_to_fp(fp_to_int(x)) without a load/store pair is updated here with support for unsigned integers, and to support single-precision values without a third rounding step, on newer cores with the appropriate instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225248 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 06:01:57 +00:00
Hal Finkel	38c3e2f5c5	[PowerPC] Fix test to pass on Darwin hosts git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225220 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 23:17:43 +00:00
Hal Finkel	fcfee17911	[PowerPC] Convert a README.txt entry into a better test We now produce the desired code as noted in the README.txt file (no spurious or). Remove the README entry and improve the regression test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225214 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 21:53:52 +00:00

1 2 3 4 5 ...

12378 Commits