llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-07-30 03:29:23 +00:00

Author	SHA1	Message	Date
Alexey Samsonov	e0f5aedf97	Fix tests that failed on i686-win32 after r160248: 1. FileCheck-ize epilogue.ll and allow another asm instruction to restore %rsp. 2. Remove check in widen_arith-3.ll that was hitting instruction in epilogue instead of vector add. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160274 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-16 14:33:36 +00:00
Nadav Rotem	d93ea88cde	Fix a bug in the 3-address conversion of LEA when one of the operands is an undef virtual register. The problem is that ProcessImplicitDefs removes the definition of the register and marks all uses as undef. If we lose the undef marker then we get a register which has no def, is not marked as undef. The live interval analysis does not collect information for these virtual registers and we crash in later passes. Together with Michael Kuperstein <michael.m.kuperstein@intel.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160260 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-16 10:52:25 +00:00
Alexey Samsonov	99a92f269d	This CL changes the function prologue and epilogue emitted on X86 when stack needs realignment. It is intended to fix PR11468. Old prologue and epilogue looked like this: push %rbp mov %rsp, %rbp and $alignment, %rsp push %r14 push %r15 ... pop %r15 pop %r14 mov %rbp, %rsp pop %rbp The problem was to reference the locations of callee-saved registers in exception handling: locations of callee-saved had to be re-calculated regarding the stack alignment operation. It would take some effort to implement this in LLVM, as currently MachineLocation can only have the form "Register + Offset". Funciton prologue and epilogue are now changed to: push %rbp mov %rsp, %rbp push %14 push %15 and $alignment, %rsp ... lea -$size_of_saved_registers(%rbp), %rsp pop %r15 pop %r14 pop %rbp Reviewed by Chad Rosier. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160248 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-16 06:54:09 +00:00
Nadav Rotem	d896e24299	Teach getTargetVShiftNode about TargetConstant nodes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160234 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-15 20:27:43 +00:00
NAKAMURA Takumi	a2a179dd7d	llvm/test/CodeGen/X86/2012-07-15-broadcastfold.ll: Rewrite expressions to fit various targets. - Make sure existence of "barrier". - Confirm reload corresponding to spill. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160232 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-15 14:38:35 +00:00
Nadav Rotem	aec9f382dd	Rename VBROADCASTSDrm into VBROADCASTSDYrm to match the naming convention. Allow the folding of vbroadcastRR to vbroadcastRM, where the memory operand is a spill slot. PR12782. Together with Michael Kuperstein <michael.m.kuperstein@intel.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160230 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-15 12:26:30 +00:00
Nadav Rotem	65f489fd7d	AVX: Fix a bug in getTargetVShiftNode. The shift amount has to be a 128bit vector with the same element type as the input vector. This is needed because of the patterns we have for the VP[SLL/SRA/SRL][W/D/Q] instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160222 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-14 22:26:05 +00:00
Nadav Rotem	b7e230d999	Add a dagcombine optimization to convert concat_vectors of undefs into a single undef. The unoptimized concat_vectors isd prevented the canonicalization of the vector_shuffle node. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160221 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-14 21:30:27 +00:00
Duncan Sands	ab6e47bee6	Restrict this to x86, hopefully fixing ARM buildbots. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160163 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-13 07:02:00 +00:00
Benjamin Kramer	feae00a68e	Give the rdrand instructions a SideEffect flag and a chain so MachineCSE and MachineLICM don't touch it. I already had the necessary things in place for IR-level passes but missed the machine passes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160137 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-12 18:14:57 +00:00
Nadav Rotem	4dff258bfb	The LIT tests below do not specify the exact cpu model and fail on AVX2 machines, because we select different instructions such as vbroadcast, new shuffles, etc. Patch by Michael Liao. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160129 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-12 13:45:15 +00:00
NAKAMURA Takumi	223875e3f8	llvm/test/CodeGen/X86/rdrand.ll: Relax expression corresponding to Win64 CC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160124 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-12 10:22:57 +00:00
Benjamin Kramer	51bf934e4d	Use %s instead of the explicit name, the latter doesn't work in out-of-tree builds. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160120 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-12 09:36:29 +00:00
Benjamin Kramer	b9bee04995	Add intrinsics for Ivy Bridge's rdrand instruction. The rdrand/cmov sequence is the same that is emitted by both GCC and ICC. Fixes PR13284. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160117 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-12 09:31:43 +00:00
Duncan Sands	4e8982a34d	The result type of EXTRACT_VECTOR_ELT doesn't have to match the element type of the input vector, it can be bigger (this is helpful for powerpc where <2 x i16> is a legal vector type but i16 isn't a legal type, IIRC). However this wasn't being taken into account by ExpandRes_EXTRACT_VECTOR_ELT, causing PR13220. Lightly tweaked version of a patch by Michael Liao. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160116 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-12 09:01:35 +00:00
Craig Topper	5aba78bd80	Update GATHER instructions to support 2 read-write operands. Patch from myself and Manman Ren. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160110 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-12 06:52:41 +00:00
Manman Ren	84ae7e9034	X86: Update to peephole optimization to move Movr0 before (Sub, Cmp) pair. When Movr0 is between sub and cmp, we move Movr0 before sub if it enables removal of Cmp. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160066 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-11 19:35:12 +00:00
Benjamin Kramer	597f2950d8	PR13326: Fix a subtle edge case in the udiv -> magic multiply generator. This caused 6 of 65k possible 8 bit udivs to be wrong. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160058 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-11 18:31:59 +00:00
Nadav Rotem	5cd95e1478	When ext-loading and trunc-storing vectors to memory, on x86 32bit systems, allow loads/stores of 64bit values from xmm registers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160044 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-11 13:27:05 +00:00
Chad Rosier	542e35f62d	Add newline. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160006 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-10 17:57:00 +00:00
Chad Rosier	3d3c75c665	Add test case accidentally omitted from r160002. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160004 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-10 17:49:39 +00:00
Chad Rosier	3f0dbab963	Add support for dynamic stack realignment in the presence of dynamic allocas on X86. Basically, this is a reapplication of r158087 with a few fixes. Specifically, (1) the stack pointer is restored from the base pointer before popping callee-saved registers and (2) in obscure cases (see comments in patch) we must cache the value of the original stack adjustment in the prologue and apply it in the epilogue. rdar://11496434 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160002 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-10 17:45:53 +00:00
Nadav Rotem	2dd83eb1ab	Improve the loading of load-anyext vectors by allowing the codegen to load multiple scalars and insert them into a vector. Next, we shuffle the elements into the correct places, as before. Also fix a small dagcombine bug in SimplifyBinOpWithSameOpcodeHands, when the migration of bitcasts happened too late in the SelectionDAG process. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159991 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-10 13:25:08 +00:00
Manman Ren	6209364834	X86: implement functions to analyze & synthesize CMOV\|SET\|Jcc getCondFromSETOpc, getCondFromCMovOpc, getSETFromCond, getCMovFromCond No functional change intended. If we want to update the condition code of CMOV\|SET\|Jcc, we first analyze the opcode to get the condition code, then update the condition code, finally synthesize the new opcode form the new condition code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159955 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-09 18:57:12 +00:00
Manman Ren	2d4215f759	X86: Fix optimizeCompare to correctly check safe condition. It is safe if EFLAGS is killed or re-defined. When we are done with the basic block, check whether EFLAGS is live-out. Do not optimize away cmp if EFLAGS is live-out. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159888 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-07 03:34:46 +00:00
Manman Ren	2af66dc51a	X86: peephole optimization to remove cmp instruction For each Cmp, we check whether there is an earlier Sub which make Cmp redundant. We handle the case where SUB operates on the same source operands as Cmp, including the case where the two source operands are swapped. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159838 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-06 17:36:20 +00:00
Chad Rosier	fd065bbed1	[fast-isel] Tell fast-isel to do nothing with the new donothing intrinsic. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159837 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-06 17:33:39 +00:00
Duncan Sands	4c3916f840	Attempt to fix windows buildbots. Patch by James Benton. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159826 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-06 14:43:16 +00:00
NAKAMURA Takumi	365f1b84d3	test/CodeGen/X86/sext-setcc-self.ll: Mark it as XFAIL: cygwin,mingw32,win32. Investigating. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159820 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-06 12:12:39 +00:00
Duncan Sands	e7de3b29f7	Use the right kind of booleans: we were emitting 0/1 booleans, instead of 0/-1 booleans. Patch by James Benton. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159739 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-05 09:32:46 +00:00
Jakob Stoklund Olesen	b872078701	Ensure CopyToReg nodes are always glued to the call instruction. The CopyToReg nodes that set up the argument registers before a call must be glued to the call instruction. Otherwise, the scheduler may emit the physreg copies long before the call, causing long live ranges for the fixed registers. Besides disabling good register allocation, that can also expose problems when EmitInstrWithCustomInserter() splits a basic block during the live range of a physreg. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159721 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-04 19:28:31 +00:00
Rafael Espindola	25dd5fc1cd	Add a testcase for pr13209. It is not a great test, but it still fails if 159509 and 159479 are reverted. It would be really nice to be able to run just the coalescer :-( git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159715 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-04 16:06:00 +00:00
Jakob Stoklund Olesen	59bde4d8a1	Add early if-conversion support to X86. Implement the TII hooks needed by EarlyIfConversion to create cmov instructions and estimate their latency. Early if-conversion is still not enabled by default. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159695 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-04 00:09:58 +00:00
NAKAMURA Takumi	ea957f0c56	test/CodeGen/X86/sincos.ll: FileCheck-ize. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159639 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-03 03:59:22 +00:00
NAKAMURA Takumi	a16d8c30cc	test/CodeGen/X86/fabs.ll: FileCheck-ize. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159638 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-03 03:59:15 +00:00
NAKAMURA Takumi	40b7e7eb97	test/CodeGen/X86/2007-09-05-InvalidAsm.ll: FileCheck-ize. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159637 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-03 03:59:08 +00:00
NAKAMURA Takumi	0e0d62ebd9	test/CodeGen/X86/2004-03-30-Select-Max.ll: FileCheck-ize. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159636 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-03 03:58:59 +00:00
Chandler Carruth	1de43ede89	Fix the remaining TCL-style quotes found in the testsuite. This is another mechanical change accomplished though the power of terrible Perl scripts. I have manually switched some "s to 's to make escaping simpler. While I started this to fix tests that aren't run in all configurations, the massive number of tests is due to a really frustrating fragility of our testing infrastructure: things like 'grep -v', 'not grep', and 'expected failures' can mask broken tests all too easily. Essentially, I'm deeply disturbed that I can change the testsuite so radically without causing any change in results for most platforms. =/ git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159547 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-02 19:09:46 +00:00
Chandler Carruth	49589f0d0e	Convert the uses of '\|&' to use '2>&1 \|' instead, which works on old versions of Bash. In addition, I can back out the change to the lit built-in shell test runner to support this. This should fix the majority of fallout on Darwin, but I suspect there will be a few straggling issues. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159544 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-02 18:37:59 +00:00
Chandler Carruth	4177e6fff5	Convert all tests using TCL-style quoting to use shell-style quoting. This was done through the aid of a terrible Perl creation. I will not paste any of the horrors here. Suffice to say, it require multiple staged rounds of replacements, state carried between, and a few nested-construct-parsing hacks that I'm not proud of. It happens, by luck, to be able to deal with all the TCL-quoting patterns in evidence in the LLVM test suite. If anyone is maintaining large out-of-tree test trees, feel free to poke me and I'll send you the steps I used to convert things, as well as answer any painful questions etc. IRC works best for this type of thing I find. Once converted, switch the LLVM lit config to use ShTests the same as Clang. In addition to being able to delete large amounts of Python code from 'lit', this will also simplify the entire test suite and some of lit's architecture. Finally, the test suite runs 33% faster on Linux now. ;] For my 16-hardware-thread (2x 4-core xeon e5520): 36s -> 24s git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159525 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-02 12:47:22 +00:00
Elena Demikhovsky	8f40f7b867	Optimization of shuffle node that can fit to the register form of VBROADCAST instruction on AVX2. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159504 91177308-0d34-0410-b5e6-96231b3b80d8	2012-07-01 06:12:26 +00:00
Jakob Stoklund Olesen	8ccaad526a	Clear kill flags in InstrEmitter::EmitSubregNode(). When a local virtual register is made global, make sure to clear any existing kill flags. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159461 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-29 21:00:03 +00:00
Rafael Espindola	94e3b388e5	In the initial exec mode we always do a load to find the address of a variable. Before this patch in pic 32 bit code we would add the global base register and not load from that address. This is a really old bug, but before the introduction of the tls attributes we would never select initial exec for pic code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159409 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-29 04:22:35 +00:00
Manman Ren	40307c7dbe	X86: add more GATHER intrinsics in LLVM Corrected type for index of llvm.x86.avx2.gather.d.pd.256 from 256-bit to 128-bit. Corrected types for src\|dst\|mask of llvm.x86.avx2.gather.q.ps.256 from 256-bit to 128-bit. Support the following intrinsics: llvm.x86.avx2.gather.d.q, llvm.x86.avx2.gather.q.q llvm.x86.avx2.gather.d.q.256, llvm.x86.avx2.gather.q.q.256 llvm.x86.avx2.gather.d.d, llvm.x86.avx2.gather.q.d llvm.x86.avx2.gather.d.d.256, llvm.x86.avx2.gather.q.d.256 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159402 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-29 00:54:20 +00:00
Manman Ren	1f7a1b68a0	X86: add GATHER intrinsics (AVX2) in LLVM Support the following intrinsics: llvm.x86.avx2.gather.d.pd, llvm.x86.avx2.gather.q.pd llvm.x86.avx2.gather.d.pd.256, llvm.x86.avx2.gather.q.pd.256 llvm.x86.avx2.gather.d.ps, llvm.x86.avx2.gather.q.ps llvm.x86.avx2.gather.d.ps.256, llvm.x86.avx2.gather.q.ps.256 Modified Disassembler to handle VSIB addressing mode. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159221 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-26 19:47:59 +00:00
Elena Demikhovsky	1596373671	Shuffle optimization for AVX/AVX2. The current patch optimizes frequently used shuffle patterns and gives these instruction sequence reduction. Before: vshufps $-35, %xmm1, %xmm0, %xmm2 ## xmm2 = xmm0[1,3],xmm1[1,3] vpermilps $-40, %xmm2, %xmm2 ## xmm2 = xmm2[0,2,1,3] vextractf128 $1, %ymm1, %xmm1 vextractf128 $1, %ymm0, %xmm0 vshufps $-35, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm0[1,3],xmm1[1,3] vpermilps $-40, %xmm0, %xmm0 ## xmm0 = xmm0[0,2,1,3] vinsertf128 $1, %xmm0, %ymm2, %ymm0 After: vshufps $13, %ymm0, %ymm1, %ymm1 ## ymm1 = ymm1[1,3],ymm0[0,0],ymm1[5,7],ymm0[4,4] vshufps $13, %ymm0, %ymm0, %ymm0 ## ymm0 = ymm0[1,3,0,0,5,7,4,4] vunpcklps %ymm1, %ymm0, %ymm0 ## ymm0 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5] git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159188 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-26 08:04:10 +00:00
Andrew Trick	c9b1e25493	Enable the new LoopInfo algorithm by default. The primary advantage is that loop optimizations will be applied in a stable order. This helps debugging and unit test creation. It is also a better overall implementation without pathologically bad performance on deep functions. On large functions (llvm-stress --size=200000 \| opt -loops) Before: 0.1263s After: 0.0225s On deep functions (after tweaking llvm-stress, thanks Nadav): Before: 0.2281s After: 0.0227s See r158790 for more comments. The loop tree is now consistently generated in forward order, but loop passes are applied in reverse order over the program. If we have a loop optimization that prefers forward order, that can easily be achieved by adding a different type of LoopPassManager. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159183 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-26 04:11:38 +00:00
Eli Friedman	52d418df5d	Make some ugly hacks for inline asm operands which name a specific register a bit more thorough. PR13196. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159176 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-25 23:42:33 +00:00
Jakob Stoklund Olesen	5984d2b31f	Run ProcessImplicitDefs on SSA form where it can be much simpler. Implicitly defined virtual registers can simply have the <undef> bit set on all uses, and copies can be turned into implicit defs recursively. Physical registers are a bit trickier. We handle the common case where a physreg def is used by a nearby instruction in the same basic block. For more complicated cases, just leave the IMPLICIT_DEF instruction in. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159149 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-25 18:12:18 +00:00
Jakob Stoklund Olesen	82d58b147f	%RCX is not a function live-out in eh.return functions. The function live-out registers must be live at all function returns, and %RCX is only used by eh.return. When a function also has a normal return, only %RAX holds a return value. This fixes PR13188. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159116 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-24 15:53:01 +00:00
Hans Wennborg	ce718ff9f4	Extend the IL for selecting TLS models (PR9788) This allows the user/front-end to specify a model that is better than what LLVM would choose by default. For example, a variable might be declared as @x = thread_local(initialexec) global i32 42 if it will not be used in a shared library that is dlopen'ed. If the specified model isn't supported by the target, or if LLVM can make a better choice, a different model may be used. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159077 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-23 11:37:03 +00:00
Chad Rosier	e5457d2116	FileCheckize tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159044 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-22 23:04:02 +00:00
Evan Cheng	c90a1fcf9f	EmitZerofill should take a 64-bit size or else it's chopping off large zero-filled global. rdar://11729134 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@159023 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-22 20:14:46 +00:00
Jakob Stoklund Olesen	e208c49172	Functions calling __builtin_eh_return must have a frame pointer. The code in X86TargetLowering::LowerEH_RETURN() assumes that a frame pointer exists, but the frame pointer was forced by the presence of llvm.eh.unwind.init which isn't guaranteed. If llvm.eh.unwind.init is actually required in functions calling eh.return (is it?), we should diagnose that instead of emitting bad machine code. This should fix the dragonegg-x86_64-linux-gcc-4.6-test bot. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158961 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-22 03:04:27 +00:00
Jakob Stoklund Olesen	c4118452bc	Remove the -live-regunits command line option. Register allocators depend on it being permanently enabled now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158873 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-20 23:31:34 +00:00
Jakob Stoklund Olesen	7824152557	Only update regunit live ranges that have been precomputed. Regunit live ranges are computed on demand, so when mi-sched calls handleMove, some regunits may not have live ranges yet. That makes updating them easier: Just skip the non-existing ranges. They will be computed correctly from the rescheduled machine code when they are needed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158831 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-20 18:00:57 +00:00
Craig Topper	703c38bf58	Don't insert 128-bit UNDEF into 256-bit vectors. Just keep the 256-bit vector. Original patch by Elena Demikhovsky. Tweaked by me to allow possibility of covering more cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158792 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-20 05:39:26 +00:00
Rafael Espindola	565bdbf598	really add a triple :-( git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158696 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-19 02:17:35 +00:00
Rafael Espindola	e08c1347d9	Add a triple to the test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158695 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-19 01:42:34 +00:00
Rafael Espindola	d6b43a317e	Move the support for using .init_array from ARM to the generic TargetLoweringObjectFileELF. Use this to support it on X86. Unlike ARM, on X86 it is not easy to find out if .init_array should be used or not, so the decision is made via TargetOptions and defaults to off. Add a command line option to llc that enables it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158692 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-19 00:48:28 +00:00
Chandler Carruth	457dfbac8a	Add a regression test for the bug exposed by r158087, which has been temporarily reverted. This test is annoyingly overspecified, but I don't know of another way to thoroughly test the saving and restoring of the registers. While this will have to be adjusted even with the issue fixed in order to re-apply r158087, those adjustments should very clearly indicate that it is still correct (%esp getting restored prior to pops), whereas without it, this case can easily slip under the radar. Still, any suggestions for improvements are very welcome. All credit to Matt Beaumont-Gay for reducing this out of an insane Address Sanitizer crash to a reasonably small seg-faulting C program when built with -mstackrealign. I just reduced it to IR, which was much simpler. =] git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158656 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-18 09:15:04 +00:00
Chandler Carruth	43369249e7	Temporarily revert r158087. This patch causes problems when both dynamic stack realignment and dynamic allocas combine in the same function. With this patch, we no longer build the epilog correctly, and silently restore registers from the wrong position in the stack. Thanks to Matt for tracking this down, and getting at least an initial test case to Chad. I'm going to try to check a variation of that test case in so we can easily track the fixes required. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158654 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-18 07:03:12 +00:00
Craig Topper	cc95b57d42	Fix intrinsics for XOP frczss/sd instructions. These instructions only take one source register and zero the upper bits of the destination rather than preserving them. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158396 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-13 07:18:53 +00:00
Craig Topper	c29106b36f	Replace XOP vpcom intrinsics with fewer intrinsics that take the immediate as an argument. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158278 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-09 16:46:13 +00:00
Jakob Stoklund Olesen	6660ed5f2f	Don't run RAFast in the optimizing regalloc pipeline. The fast register allocator is not supposed to work in the optimizing pipeline. It doesn't make sense to compute live intervals, run full copy coalescing, and then run RAFast. Fast register allocation in the optimizing pipeline is better done by RABasic. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158242 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-08 23:15:12 +00:00
Manman Ren	6620ccf5d8	Test case for r158160 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158218 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-08 18:42:37 +00:00
Manman Ren	9236362a64	X86: optimize generated code for integer ABS This patch will generate the following for integer ABS: movl %edi, %eax negl %eax cmovll %edi, %eax INSTEAD OF movl %edi, %ecx sarl $31, %ecx leal (%rdi,%rcx), %eax xorl %ecx, %eax There exists a target-independent DAG combine for integer ABS, which converts integer ABS to sar+add+xor. For X86, we match this pattern back to neg+cmov. This is implemented in PerformXorCombine. rdar://10695237 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158175 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-07 22:39:10 +00:00
Rafael Espindola	c07f5bbd3b	Use a base register instead of an index register with the local dynamic model. Fixes pr13048. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158158 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-07 18:39:19 +00:00
Manman Ren	87253c2ebd	X86: replace SUB with CMP if possible This patch will optimize the following movq %rdi, %rax subq %rsi, %rax cmovsq %rsi, %rdi movq %rdi, %rax to cmpq %rsi, %rdi cmovsq %rsi, %rdi movq %rdi, %rax Perform this optimization if the actual result of SUB is not used. rdar: 11540023 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158126 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-07 00:42:47 +00:00
Manman Ren	2afde7782d	Revert r157755. The commit is intended to fix rdar://11540023. It is implemented as part of peephole optimization. We can actually implement this in the SelectionDAG lowering phase. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158122 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-06 23:53:03 +00:00
Chad Rosier	a97b180fc4	Add support for dynamic stack realignment in the presence of dynamic allocas on X86. rdar://11496434 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@158087 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-06 17:37:40 +00:00
Nadav Rotem	fcb2c3cf5e	Remove the "-promote-elements" flag. This flag is now enabled by default. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157925 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-04 11:27:21 +00:00
Craig Topper	a15f9d5311	Rename FMA3 feature flag to just FMA to match gcc so it can be added to clang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157903 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-03 18:58:46 +00:00
Craig Topper	529ce07c5f	Rename fma4 intrinsics to just fma since they are now used for both FMA4 and FMA3. Autoupgrade support coming in a separate commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157898 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-03 07:26:46 +00:00
Manman Ren	c73ea9102b	Revert r157831 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157896 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-03 03:14:24 +00:00
Craig Topper	57ae246a6a	Use sse_load_f32/64 for scalar FMA3 intrinsic patterns instead of 128-bit loads to match instruction behavior. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157895 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-03 01:40:43 +00:00
Manman Ren	73c2f7f5ed	X86: peephole optimization to remove cmp instruction This patch will optimize the following: sub r1, r3 cmp r3, r1 or cmp r1, r3 bge L1 TO sub r1, r3 bge L1 or ble L1 If the branch instruction can use flag from "sub", then we can eliminate the "cmp" instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157831 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-01 19:49:33 +00:00
Chris Lattner	2b76473929	testcase for PR13006, thanks to Duncan for filing it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157824 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-01 18:19:46 +00:00
Hans Wennborg	f0234fcbc9	Implement the local-dynamic TLS model for x86 (PR3985) This implements codegen support for accesses to thread-local variables using the local-dynamic model, and adds a clean-up pass so that the base address for the TLS block can be re-used between local-dynamic access on an execution path. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157818 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-01 16:27:21 +00:00
Craig Topper	3a8172ad8d	Remove fadd(fmul) patterns for FMA3. This needs to be implemented by paying attention to FP_CONTRACT and matching @llvm.fma which is not available yet. This will allow us to enablle intrinsic use at least though. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157804 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-01 06:07:48 +00:00
Chris Lattner	f59e4e3452	enhance the logic for looking through tailcalls to look through transparent casts in multiple-return value scenarios, like what happens on X86-64 when returning small structs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157800 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-01 05:29:15 +00:00
Chris Lattner	5b0d946537	enhance getNoopInput to know about vector<->vector bitcasts of legal types, as well as int<->ptr casts. This allows us to tailcall functions with some trivial casts between the call and return (i.e. because the return types disagree). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157798 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-01 05:16:33 +00:00
Chris Lattner	09c14c0836	add some simple 64-bit tail call tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157797 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-01 05:03:31 +00:00
Chris Lattner	e8ea60b8ba	merge some tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157795 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-01 05:00:54 +00:00
Chris Lattner	e109648880	rename test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157794 91177308-0d34-0410-b5e6-96231b3b80d8	2012-06-01 04:58:50 +00:00
Manman Ren	91c5346d91	X86: replace SUB with CMP if possible This patch will optimize the following movq %rdi, %rax subq %rsi, %rax cmovsq %rsi, %rdi movq %rdi, %rax to cmpq %rsi, %rdi cmovsq %rsi, %rdi movq %rdi, %rax Perform this optimization if the actual result of SUB is not used. rdar: 11540023 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157755 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-31 17:20:29 +00:00
Elena Demikhovsky	177cf1e1a3	Added FMA3 Intel instructions. I disabled FMA3 autodetection, since the result may differ from expected for some benchmarks. I added tests for GodeGen and intrinsics. I did not change llvm.fma.f32/64 - it may be done later. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157737 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-31 09:20:20 +00:00
Craig Topper	0559a2f8ae	Add intrinsic for pclmulqdq instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157731 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-31 04:37:40 +00:00
Jakob Stoklund Olesen	9cda1be0aa	Prioritize smaller register classes for urgent evictions. It helps compile exotic inline asm. In the test case, normal GR32 virtual registers use up eax-edx so the final GR32_ABCD live range has no registers left. Since all the live ranges were tiny, we had no way of prioritizing the smaller register class. This patch allows tiny unspillable live ranges to be evicted by tiny unspillable live ranges from a smaller register class. <rdar://problem/11542429> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157715 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-30 21:46:58 +00:00
Chris Lattner	f186df0d3e	it's pointed out that R11 can be used for magic things, and doing things just for 64-bit registers is silly. Just optimize 3 more. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157699 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-30 18:08:02 +00:00
Chris Lattner	5aaabbfe62	Extend the (abi-irrelevant) return convention to be able to return more than two values in integer registers. This is already supported by the fastcc convention, but it doesn't hurt to support it in the standard conventions as well. In cases where we can cheat at the calling convention, this allows us to avoid returning things through memory in more cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157698 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-30 17:50:14 +00:00
Benjamin Kramer	1386e9b7b1	Add intrinsics, code gen, assembler and disassembler support for the SSE4a extrq and insertq instructions. This required light surgery on the assembler and disassembler because the instructions use an uncommon encoding. They are the only two instructions in x86 that use register operands and two immediates. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157634 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-29 19:05:25 +00:00
Chris Lattner	c32cef6aa1	These tests used intrinsics with the wrong prototype. They weren't caught because the old verifier just checked that something "was a pointer", but not that the pointee was correct. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157544 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-27 19:35:41 +00:00
Benjamin Kramer	c511b2a5a1	SelectionDAGBuilder: When emitting small compare chains for switches order them by using edge weights. SimplifyCFG tends to form a lot of 2-3 case switches when merging branches. Move the most likely condition to the front so it is checked first and the others can be skipped. This is currently not as effective as it could be because SimplifyCFG destroys profiling metadata when merging branches and switches. Merging branch weight metadata is tricky though. This code touches at most 3 cases so I didn't use a proper sorting algorithm. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157521 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-26 20:01:32 +00:00
NAKAMURA Takumi	f755e0001a	test/CodeGen/X86/bigstructret.ll: Suppress one test. It is msvc-incompatible. (compatible to mingw32 and netbsd, though) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157474 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-25 15:40:54 +00:00
NAKAMURA Takumi	a389a23bbb	test/CodeGen/X86/bigstructret.ll: Relax stack offsets for hosts of stack-align=8, eg. win32 and netbsd. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157471 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-25 15:12:21 +00:00
Eli Friedman	2db0e9ebb6	Simplify code for calling a function where CanLowerReturn fails, fixing a small bug in the process. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157446 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-25 00:09:29 +00:00
David Blaikie	28a5ab2fb4	Fix for CHECK-NOT misspelling. Patch by Nicklas Bo Jensen. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157421 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-24 22:08:29 +00:00
Jakob Stoklund Olesen	e3b548219f	Correctly deal with identity copies in RegisterCoalescer. Now that the coalescer keeps live intervals and machine code in sync at all times, it needs to deal with identity copies differently. When merging two virtual registers, all identity copies are removed right away. This means that other identity copies must come from somewhere else, and they are going to have a value number. Deal with such copies by merging the value numbers before erasing the copy instruction. Otherwise, we leave dangling value numbers in the live interval. This fixes PR12927. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157340 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-23 20:21:06 +00:00
Nuno Lopes	23e75da7e0	revert my previous patches that introduced an additional parameter to the objectsize intrinsic. After a lot of discussion, we realized it's not the best option for run-time bounds checking git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157255 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-22 15:25:31 +00:00
Jakob Stoklund Olesen	76ff741836	Only erase virtregs with no uses left. Also make sure registers aren't erased twice if the dead def mentions the register twice. This fixes PR12911. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157254 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-22 14:52:12 +00:00
Craig Topper	8ae97baef2	Allow 256-bit shuffles to still be split even if only half of the shuffle comes from two 128-bit pieces. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157175 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-21 06:40:16 +00:00
Peter Collingbourne	92d63ccfc7	When legalising shifts, do not pre-build a list of operands which may be RAUW'd by the recursive call to LegalizeOps; instead, retrieve the other operands when calling UpdateNodeOperands. Fixes PR12889. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157162 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-20 18:36:15 +00:00
Jakob Stoklund Olesen	ee0d5d4398	Properly constrain register classes for sub-registers. Not all GR64 registers have sub_8bit sub-registers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157150 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-20 06:38:37 +00:00
Jakob Stoklund Olesen	8e86929e3c	Properly constrain register classes in 2-addr. X86 has 2-addr instructions with different constraints on the tied def and use operands. One is GR32, one is GR32_NOSP. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157149 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-20 06:38:32 +00:00
Jakob Stoklund Olesen	7ebed91fdd	Fix 12892. Dead code elimination during coalescing could cause a virtual register to be split into connected components. The following rewriting would be confused about the already joined copies present in the code, but without a corresponding value number in the live range. Erase all joined copies instantly when joining intervals such that the MI and LiveInterval representations are always in sync. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157135 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-19 23:34:59 +00:00
Jakob Stoklund Olesen	ccce1233a2	Erase joined copies immediately. The late dead code elimination is no longer necessary. The test changes are cause by a register hint that can be either %rdi or %rax. The choice depends on the use list order, which this patch changes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157131 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-19 20:54:07 +00:00
Nadav Rotem	87d35e8c71	On Haswell, perfer storing YMM registers using a single instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157129 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-19 20:30:08 +00:00
Nadav Rotem	4fc8a5de44	Add support for additional in-reg vbroadcast patterns git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157127 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-19 19:57:37 +00:00
Craig Topper	b82b5abf78	Simplify handling of v16i8 shuffles and fix a missed optimization. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157043 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-18 06:42:06 +00:00
Evan Cheng	ad75364815	Teach two-address pass to update the "source" map so it doesn't perform a non-profitable commute using outdated info. The test case would still fail because of poor pre-RA schedule. That will be fixed by MI scheduler. rdar://11472010 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157038 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-18 01:33:51 +00:00
Jakob Stoklund Olesen	0e5e821a69	Remove a test that was only testing for physreg joining. This is the same as the other tests: Clever tricks are required to make the arguments and return value line up in a single-instruction function. It rarely happens in real life. We have plenty other examples of this behavior. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157030 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-18 00:07:14 +00:00
Jakob Stoklund Olesen	ed18a3e6b2	Remove -join-physregs from the test suite. This option has been disabled for a while, and it is going away so I can clean up the coalescer code. The tests that required physreg joining to be enabled were almost all of the form "tiny function with interference between arguments and return value". Such functions are usually inlined in the real world. The problem exposed by phys_subreg_coalesce-3.ll is real, but fairly rare. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@157027 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-17 23:44:19 +00:00
Evan Cheng	6100366c2f	Avoid creating a cycle when folding load / op with flag / store. PR11451474. rdar://11451474 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156896 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-16 01:54:27 +00:00
Jakob Stoklund Olesen	4d10829e12	Fix PR12821. RAFast must add an <imp-def> operand when it is rewriting a sub-register def that isn't a read-modify-write. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156777 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-14 21:10:25 +00:00
Dan Gohman	a6063c6e29	Rename @llvm.debugger to @llvm.debugtrap. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156774 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-14 18:58:10 +00:00
Hans Wennborg	12447575dc	Fix test/CodeGen/X86/tls-pie.ll. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156612 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-11 10:19:54 +00:00
Hans Wennborg	228756c744	Implement initial-exec TLS model for 32-bit PIC x86 This fixes a TODO from 2007 :) Previously, LLVM would emit the wrong code here (see the update to test/CodeGen/X86/tls-pie.ll). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156611 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-11 10:11:01 +00:00
Dan Gohman	d4347e1af9	Define a new intrinsic, @llvm.debugger. It will be similar to __builtin_trap(), but it generates int3 on x86 instead of ud2. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156593 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-11 00:19:32 +00:00
Nadav Rotem	b210651654	AVX2: Add an additional broadcast idiom. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156540 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-10 12:39:13 +00:00
Nadav Rotem	5fc2187a02	Generate AVX/AVX2 shuffles even when there is a memory op somewhere else in the program. Starting r155461 we are able to select patterns for vbroadcast even when the load op is used by other users. Fix PR11900. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156539 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-10 12:22:05 +00:00
Nuno Lopes	30759542aa	change the objectsize intrinsic signature: add a 3rd parameter to denote the maximum runtime performance penalty that the user is willing to accept. This commit only adds the parameter. Code taking advantage of it will follow. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156473 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-09 15:52:43 +00:00
Craig Topper	189bce48c7	Remove 256-bit AVX non-temporal store intrinsics. Similar was previously done for 128-bit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156375 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-08 06:58:15 +00:00
Chad Rosier	42726835e3	Fix a regression from r147481. This combine should only happen if there is a single use. rdar://11360370 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156316 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-07 18:47:44 +00:00
Manman Ren	ed57984483	X86: optimization for -(x != 0) This patch will optimize -(x != 0) on X86 FROM cmpl $0x01,%edi sbbl %eax,%eax notl %eax TO negl %edi sbbl %eax %eax In order to generate negl, I added patterns in Target/X86/X86InstrCompiler.td: def : Pat<(X86sub_flag 0, GR32:$src), (NEG32r GR32:$src)>; rdar: 10961709 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156312 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-07 18:06:23 +00:00
Craig Topper	5f9cccc509	Add SSE4A MOVNTSS/MOVNTSD instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156281 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-07 05:36:19 +00:00
Benjamin Kramer	77c4ef8a47	Switch the select to branch transformation on by default. The primitive conservative heuristic seems to give a slight overall improvement while not regressing stuff. Make it available to wider testing. If you notice any speed regressions (or significant code size regressions) let me know! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156258 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-06 14:25:16 +00:00
Benjamin Kramer	59957500f9	CodeGenPrepare: Add a transform to turn selects into branches in some cases. This came up when a change in block placement formed a cmov and slowed down a hot loop by 50%: ucomisd (%rdi), %xmm0 cmovbel %edx, %esi cmov is a really bad choice in this context because it doesn't get branch prediction. If we emit it as a branch, an out-of-order CPU can do a better job (if the branch is predicted right) and avoid waiting for the slow load+compare instruction to finish. Of course it won't help if the branch is unpredictable, but those are really rare in practice. This patch uses a dumb conservative heuristic, it turns all cmovs that have one use and a direct memory operand into branches. cmovs usually save some code size, so we disable the transform in -Os mode. In-Order architectures are unlikely to benefit as well, those are included in the "predictableSelectIsExpensive" flag. It would be better to reuse branch probability info here, but BPI doesn't support select instructions currently. It would make sense to use the same heuristics as the if-converter pass, which does the opposite direction of this transform. Test suite shows a small improvement here and there on corei7-level machines, but the actual results depend a lot on the used microarchitecture. The transformation is currently disabled by default and available by passing the -enable-cgp-select2branch flag to the code generator. Thanks to Chandler for the initial test case to him and Evan Cheng for providing me with comments and test-suite numbers that were more stable than mine :) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156234 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-05 12:49:22 +00:00
Craig Topper	f3640d7ec1	Allow v16i16 and v32i8 shuffles to be rewritten as narrower shuffles. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156156 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-04 04:44:49 +00:00
Craig Topper	6b28d356c5	Fix 256-bit vpshuflw and vpshufhw immediate encoding to handle undefs in the lower half correctly. Missed in r155982. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156059 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-03 07:12:59 +00:00
Evan Cheng	d99d68bcee	Fix two-address pass's aggressive instruction commuting heuristics. It's meant to catch cases like: %reg1024<def> = MOV r1 %reg1025<def> = MOV r0 %reg1026<def> = ADD %reg1024, %reg1025 r0 = MOV %reg1026 By commuting ADD, it let coalescer eliminate all of the copies. However, there was a bug in the heuristics where it ended up commuting the ADD in: %reg1024<def> = MOV r0 %reg1025<def> = MOV 0 %reg1026<def> = ADD %reg1024, %reg1025 r0 = MOV %reg1026 That did no benefit but rather ensure the last MOV would not be coalesced. rdar://11355268 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@156048 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-03 01:45:13 +00:00
Manman Ren	e2849851b2	Revert r155853 The commit is intended to fix rdar://10961709. But it is the root cause of PR12720. Revert it for now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155992 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-02 15:24:32 +00:00
Craig Topper	a9a568a79d	Add support for selecting AVX2 vpshuflw and vpshufhw. Add decoding support for AsmPrinter. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155982 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-02 08:03:44 +00:00
Bill Wendling	95dd442041	Strip the pointer casts off of allocas so that the selection DAG can find them. PR10799 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155954 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-01 22:50:45 +00:00
Manman Ren	769ea2f93f	X86: optimization for max-like struct This patch will optimize the following cases on X86 (a > b) ? (a-b) : 0 (a >= b) ? (a-b) : 0 (b < a) ? (a-b) : 0 (b <= a) ? (a-b) : 0 FROM movl %edi, %ecx subl %esi, %ecx cmpl %edi, %esi movl $0, %eax cmovll %ecx, %eax TO xorl %eax, %eax subl %esi, %edi cmovll %eax, %edi movl %edi, %eax rdar: 10734411 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155919 91177308-0d34-0410-b5e6-96231b3b80d8	2012-05-01 17:16:15 +00:00
Manman Ren	16a76519a5	X86: optimization for -(x != 0) This patch will optimize -(x != 0) on X86 FROM cmpl $0x01,%edi sbbl %eax,%eax notl %eax TO negl %edi sbbl %eax %eax git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155853 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-30 22:51:25 +00:00
Manman Ren	1701105022	test/CodeGen/X86/select.ll: remove spaces git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155840 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-30 18:54:27 +00:00
Derek Schuff	ddc693bd22	Fix fastcc structure return with fast-isel on x86-32 On x86-32, structure return via sret lets the callee pop the hidden pointer argument off the stack, which the caller then re-pushes. However if the calling convention is fastcc, then a register is used instead, and the caller should not adjust the stack. This is implemented with a check of IsTailCallConvention X86TargetLowering::LowerCall but is now checked properly in X86FastISel::DoSelectCall. (this time, actually commit what was reviewed!) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155825 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-30 16:57:15 +00:00
Andrew Trick	2674a4acdb	Reapply 155668: Fix the SD scheduler to avoid gluing the same node twice. This time, also fix the caller of AddGlue to properly handle incomplete chains. AddGlue had failure modes, but shamefully hid them from its caller. It's luck ran out. Fixes rdar://11314175: BuildSchedUnits assert. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155749 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-28 01:03:23 +00:00
Derek Schuff	f3db6b855e	Revert r155745 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155746 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-27 23:37:41 +00:00
Derek Schuff	9dc28b0722	Fix fastcc structure return with fast-isel on x86-32 On x86-32, structure return via sret lets the callee pop the hidden pointer argument off the stack, which the caller then re-pushes. However if the calling convention is fastcc, then a register is used instead, and the caller should not adjust the stack. This is implemented with a check of IsTailCallConvention X86TargetLowering::LowerCall but is now checked properly in X86FastISel::DoSelectCall. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155745 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-27 23:27:17 +00:00
Andrew Trick	0e47cfd5b6	Temporarily revert r155668: Fix the SD scheduler to avoid gluing. This definitely caused regression with ARM -mno-thumb. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155743 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-27 22:55:59 +00:00
Chad Rosier	a73b6fc511	Add x86-specific DAG combine to simplify: x == -y --> x+y == 0 x != -y --> x+y != 0 On x86, the generated code goes from negl %esi cmpl %esi, %edi je .LBB0_2 to addl %esi, %edi je .L4 This case is correctly handled for ARM with "cmn". Patch by Manman Ren. rdar://11245199 PR12545 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155739 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-27 22:33:25 +00:00
Benjamin Kramer	17c836c4b5	X86: Don't emit conditional floating point moves on when targeting pre-pentiumpro architectures. * Model FPSW (the FPU status word) as a register. * Add ISel patterns for the FUCOM, FNSTSW and SAHF instructions. During Legalize/Lowering, build a node sequence to transfer the comparison result from FPSW into EFLAGS. If you're wondering about the right-shift: That's an implicit sub-register extraction (%ax -> %ah) which is handled later on by the instruction selector. Fixes PR6679. Patch by Christoph Erhardt! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155704 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-27 12:07:43 +00:00
Craig Topper	76c5897eae	Add mcpu to tests to prevent them from using AVX instructions on Sandy Bridge after r155618. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155696 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-27 07:11:58 +00:00
Andrew Trick	aec9240be2	Fix the SD scheduler to avoid gluing the same node twice. DAGCombine strangeness may result in multiple loads from the same offset. They both may try to glue themselves to another load. We could insist that the redundant loads glue themselves to each other, but the beter fix is to bail out from bad gluing at the time we detect it. Fixes rdar://11314175: BuildSchedUnits assert. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155668 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-26 21:48:25 +00:00
Jakob Stoklund Olesen	6962106496	Try to fix llvm-arm-linux builder with -mcpu. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155589 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-25 21:22:33 +00:00
Preston Gurd	01c0dd1be3	Trivial change to make the test use -mcpu=generic so as to avoid a failure if run on an Intel Atom with post RA instruction scheduling. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155587 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-25 21:04:54 +00:00
Nadav Rotem	2003e03045	Fix the testcase. We do expect two vblendw on XMMs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155477 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-24 19:57:38 +00:00
Nadav Rotem	34a13bb412	Add a testcase for 155440 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@155475 91177308-0d34-0410-b5e6-96231b3b80d8	2012-04-24 19:45:28 +00:00

1 2 3 4 5 ...

3817 Commits