llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-09-09 15:57:28 +00:00

Author	SHA1	Message	Date
Benjamin Kramer	4eed756153	Switch spill weights from a basic loop depth estimation to BlockFrequencyInfo. The main advantages here are way better heuristics, taking into account not just loop depth but also __builtin_expect and other static heuristics and will eventually learn how to use profile info. Most of the work in this patch is pushing the MachineBlockFrequencyInfo analysis into the right places. This is good for a 5% speedup on zlib's deflate (x86_64), there were some very unfortunate spilling decisions in its hottest loop in longest_match(). Other benchmarks I tried were mostly neutral. This changes register allocation in subtle ways, update the tests for it. 2012-02-20-MachineCPBug.ll was deleted as it's very fragile and the instruction it looked for was gone already (but the FileCheck pattern picked up unrelated stuff). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184105 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-17 19:00:36 +00:00
David Blaikie	6d9dbd5526	Debug Info: Simplify Frame Index handling in DBG_VALUE Machine Instructions Rather than using the full power of target-specific addressing modes in DBG_VALUEs with Frame Indices, simply use Frame Index + Offset. This reduces the complexity of debug info handling down to two representations of values (reg+offset and frame index+offset) rather than three or four. Ideally we could ensure that frame indices had been eliminated by the time we reached an assembly or dwarf generation, but I haven't spent the time to figure out where the FIs are leaking through into that & whether there's a good place to convert them. Some FI+offset=>reg+offset conversion is done (see PrologEpilogInserter, for example) which is necessary for some SelectionDAG assumptions about registers, I believe, but it might be possible to make this a more thorough conversion & ensure there are no remaining FIs no matter how instruction selection is performed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184066 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-16 20:34:15 +00:00
David Blaikie	4bb23594f3	DebugInfo: follow up to 184045 to constrain the tests further to ensure they don't contain +0 offsets git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184046 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-15 16:02:44 +00:00
David Blaikie	f14b44c71b	DebugInfo: print DBG_VALUE MachineInstrs with [] for deref and drop the offset when it's zero git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184045 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-15 15:52:58 +00:00
Andrew Trick	b86a0cdb67	Machine Model: Add MicroOpBufferSize and resource BufferSize. Replace the ill-defined MinLatency and ILPWindow properties with straightforward buffer sizes: MCSchedMode::MicroOpBufferSize MCProcResourceDesc::BufferSize These can be used to more precisely model instruction execution if desired. Disabled some misched tests temporarily. They'll be reenabled in a few commits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184032 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-15 04:49:57 +00:00
David Blaikie	702ff96ff3	Debug Info: Don't print the display name and colon prefix for DEBUG_VALUE comments if the display name is empty git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184026 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-15 00:33:47 +00:00
Tom Stellard	5aee09da12	R600: Add SI load support for v[24]i32 and store for v2i32 Also add a separate vector lit test file, since r600 doesn't seem to handle v2i32 load/store yet, but we can test both for SI. Patch by: Aaron Watry Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Aaron Watry <awatry@gmail.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184021 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-15 00:09:31 +00:00
Tom Stellard	d6055262d2	R600: Use correct encoding for Vertex Fetch instructions on Cayman Reviewed-by: Vincent Lejeune<vljn at ovi.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184016 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-14 22:12:30 +00:00
Tom Stellard	4efccd0fb1	R600: Use EXPORT_RAT_INST_STORE_DWORD for stores on Cayman We were using RAT_INST_STORE_RAW, which seemed to work, but the docs say this instruction doesn't exist for Cayman, so it's probably safer to use a documented instruction instead. Reviewed-by: Vincent Lejeune<vljn at ovi.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184015 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-14 22:12:24 +00:00
Tim Northover	89dbe97442	Mark rematerialized super/sub registers as dead. When we're rematerializing into a not-quite-right register we already add the real definition as an imp-def, but we should also be marking the "official" register as dead, since nothing else is going to use it as a result of this remat. Not doing this can affect pressure tracking. rdar://problem/14158833 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@184002 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-14 20:22:21 +00:00
Stephen Lin	38103d1012	SelectionDAG: Fix incorrect condition checks in some cases of folding FADD/FMUL combinations; also improve accuracy of comments git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183993 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-14 18:17:35 +00:00
Derek Schuff	8a0d41e1a6	Make PrologEpilogInserter save/restore all callee saved registers in functions which call __builtin_unwind_init() __builtin_unwind_init() is an undocumented gcc intrinsic which has this effect, and is used in libgcc_eh. Goes part of the way toward fixing PR8541. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183984 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-14 16:15:29 +00:00
Benjamin Kramer	d25ec760cb	X86: cvtpi2ps is just an SSE instruction with MMX operands. It has no AVX equivalent. Give it the right register format so we can also emit it when AVX is enabled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183971 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-14 09:31:41 +00:00
JF Bastien	fe532ad6d6	Enable FastISel on ARM for Linux and NaCl, not MCJIT This is a resubmit of r182877, which was reverted because it broke MCJIT tests on ARM. The patch leaves MCJIT on ARM as it was before: only enabled for iOS. I've CC'ed people from the original review and revert. FastISel was only enabled for iOS ARM and Thumb2, this patch enables it for ARM (not Thumb2) on Linux and NaCl, but not MCJIT. Thumb2 support needs a bit more work, mainly around register class restrictions. The patch punts to SelectionDAG when doing TLS relocation on non-Darwin targets. I will fix this and other FastISel-to-SelectionDAG failures in a separate patch. The patch also forces FastISel to retain frame pointers: iOS always keeps them for backtracking (so emitted code won't change because of this), but Linux was getting much worse code that was incorrect when using big frames (such as test-suite's lencod). I'll also fix this in a later patch, it will probably require a peephole so that FastISel doesn't rematerialize frame pointers back-to-back. The test changes are straightforward, similar to: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130513/174279.html They also add a vararg test that got dropped in that change. I ran all of lnt test-suite on A15 hardware with --optimize-option=-O0 and all the tests pass. All the tests also pass on x86 make check-all. I also re-ran the check-all tests that failed on ARM, and they all seem to pass. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183966 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-14 02:49:43 +00:00
Bill Schmidt	11729224bf	[PowerPC] Disable fast-isel for existing -O0 tests for PowerPC. This is a preliminary patch for fast instruction selection on PowerPC. Code generation can differ between DAG isel and fast isel. Existing tests that specify -O0 were written to expect DAG isel. Make this explicit by adding -fast-isel=false to the tests. In some cases specifying -fast-isel=false produces different code even when there isn't a fast instruction selector specified. This is because TM.Options.EnableFastISel = 1 at -O0 whether or not a FastISel object exists. Thus disabling fast isel can actually produce less conservative code. Because of this, some of the expected code generation in the -O0 tests needs to be adjusted. In particular, handling of function arguments is less conservative with -fast-isel=false (see isOnlyUsedInEntryBlock() in SelectionDAGBuilder.cpp). This results in fewer stack accesses and, in some cases, reduced stack size as uselessly loaded values are no longer stored back to spill locations in the stack. No functional change with this patch; test case adjustments only. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183939 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-13 20:23:34 +00:00
Akira Hatanaka	45137f954f	[mips] Add an IR transformation pass that optimizes calls to sqrt. The pass emits a call to sqrt that has attribute "read-none". This call will be converted to an ISD::FSQRT node during DAG construction, which will turn into a mips native sqrt instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183802 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-11 22:21:44 +00:00
Tim Northover	e5609f3732	X86: Stop LEA64_32r doing unspeakable things to its arguments. Previously LEA64_32r went through virtually the entire backend thinking it was using 32-bit registers until its blissful illusions were cruelly snatched away by MCInstLower and 64-bit equivalents were substituted at the last minute. This patch makes it behave normally, and take 64-bit registers as sources all the way through. Previous uses (for 32-bit arithmetic) are accommodated via SUBREG_TO_REG instructions which make the types and classes agree properly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183693 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-10 20:43:49 +00:00
Justin Holewinski	7c32502a7f	[NVPTX] Remove old CONST_NOT_GEN address space that is not being used anymore and causes constants to be emitted in the global address space git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183652 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-10 13:29:47 +00:00
JF Bastien	366d94e16c	Add test for ARM FastISel load/store register classes r183624 fixed an issue that was tested indirectly. Test it directly with this new test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183634 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-10 00:35:57 +00:00
Reed Kotler	b0ee97a366	Fix a regression I introduced when I expanded the complex pseudos in the Mips16 port. A few of the pseudos could either take signed or unsigned arguments and I did not distinguish the case and improperly rejected some valid cases that the assembler had previously accepted when they were pure pseudos that expanded as assembly instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183633 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-09 23:23:46 +00:00
Logan Chien	c8ecf53082	Refine the ARM EHABI test cases. Since we have ARM unwind directive parser and assembler, we can check the correctness in two stages: 1. From LLVM assembly (.ll) to ARM assembly (.s) 2. From ARM assembly (.s) to ELF object file (.o) We already have several ".s to .o" test cases. This CL adds some ".ll to .s" test cases and removes the redundant ".ll to .o" test cases. New test cases to check ".ll to .s" code generator: - ehabi.ll: Check the correctness of the generated unwind directives. - section-name.ll: Check the section name of functions. Removed test cases: - ehabi-mc-cantunwind.ll (Covered by ehabi-cantunwind.ll, and eh-directive-cantunwind.s) - ehabi-mc-compact-pr0.ll (Covered by ehabi.ll, eh-compact-pr0.s, eh-directive-save.s, and eh-directive-setfp.s) - ehabi-mc-compact-pr1.ll (Covered by ehabi.ll, eh-compact-pr1.s, eh-directive-save.s, and eh-directive-setfp.s) - ehabi-mc.ll (Covered by ehabi.ll, and eh-directive-integrated-test.s) - ehabi-mc-section-group.ll (Covered by section-name.ll, and eh-directive-section-comdat.s) - ehabi-mc-section.ll (Covered by section-name.ll, and eh-directive-section.s) - ehabi-mc-sh_link.ll (Covered by eh-directive-text-section.s, and eh-directive-section.s) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183628 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-09 12:36:57 +00:00
Logan Chien	18cba562c8	Fix ARM unwind opcode assembler in several cases. Changes to ARM unwind opcode assembler: * Fix multiple .save or .vsave directives. Besides, the order is preserved now. * For the directives which will generate multiple opcodes, such as ".save {r0-r11}", the order of the unwind opcode is fixed now, i.e. the registers with less encoding value are popped first. * Fix the $sp offset calculation. Now, we can use the .setfp, .pad, .save, and .vsave directives at any order. Changes to test cases: * Add test cases to check the order of multiple opcodes for the .save directive. * Fix the incorrect $sp offset in the test case. The stack pointer offset specified in the test case was incorrect. (Changed test cases: ehabi-mc-section.ll and ehabi-mc.ll) * The opcode to restore $sp are slightly reordered. The behavior are not changed, and the new output is same as the output of GNU as. (Changed test cases: eh-directive-pad.s and eh-directive-setfp.s) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183627 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-09 12:22:30 +00:00
Elena Demikhovsky	40e071c1eb	Removed PackedDouble domain from scalar instructions. Added more formats for the scalar stuff. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183626 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-09 07:37:10 +00:00
Venkatraman Govindaraju	1799921672	[Sparc] Delete FPMover Pass and remove Fp* Pseudo-instructions from Sparc backend. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183613 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-08 15:32:59 +00:00
Quentin Colombet	c0cc28301a	Reapply r183552. This time, use a standard type for the option to avoid template instantiation issue with non-standard type. Add a backend option to warn on a given stack size limit. Option: -mllvm -warn-stack-size=<limit> Output (if limit is exceeded): warning: Stack size limit exceeded (<actual size>) in <functionName>. The longer term plan is to hook that to a clang warning. PR:4072 <rdar://problem/13987214>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183595 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-08 00:07:54 +00:00
Vincent Lejeune	b01bdf87ff	R600: Anti dep better handled in tex clause git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183592 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-07 23:30:26 +00:00
Jakob Stoklund Olesen	7de1d327f1	Add missing zextloadi1 to i64 patterns. PR16721. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183587 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-07 22:55:05 +00:00
Hal Finkel	40be73bed7	Disallow i64 div/rem in PPC32 counter loops On PPC32, [su]div,rem on i64 types are transformed into runtime library function calls. As a result, they are not allowed in counter-based loops (the counter-loops verification pass caught this error; this change fixes PR16169). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183581 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-07 22:16:19 +00:00
Quentin Colombet	95f24fbe4c	Revert commits related to stack warning. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183579 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-07 22:14:50 +00:00
Quentin Colombet	2e10e8e378	Explicit triple in warn stack size test cases to not depend on OS. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183574 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-07 21:09:42 +00:00
Tom Stellard	df74b86e1e	R600: Fix calculation of stack offset in AMDGPUFrameLowering We weren't computing structure size correctly and we were relying on the original alloca instruction to compute the offset, which isn't always reliable. Reviewed-by: Vincent Lejeune <vljn@ovi.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183568 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-07 20:52:05 +00:00
Tom Stellard	ce961477be	R600: Fix the fetch limits for R600 generation GPUs Reviewed-by: Vincent Lejeune <vljn@ovi.com> https://bugs.freedesktop.org/show_bug.cgi?id=64257 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183560 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-07 20:28:55 +00:00
Quentin Colombet	9a6b9bffa5	Add a backend option to warn on a given stack size limit. Option: -mllvm -warn-stack-size=<limit> Output (if limit is exceeded): warning: Stack size limit exceeded (<actual size>) in <functionName>. The longer term plan is to hook that to a clang warning. PR:4072 <rdar://problem/13987214> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183552 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-07 20:18:12 +00:00
JF Bastien	8fc760cbe8	ARM FastISel integer sext/zext improvements My recent ARM FastISel patch exposed this bug: http://llvm.org/bugs/show_bug.cgi?id=16178 The root cause is that it can't select integer sext/zext pre-ARMv6 and asserts out. The current integer sext/zext code doesn't handle other cases gracefully either, so this patch makes it handle all sext and zext from i1/i8/i16 to i8/i16/i32, with and without ARMv6, both in Thumb and ARM mode. This should fix the bug as well as make FastISel faster because it bails to SelectionDAG less often. See fastisel-ext.patch for this. fastisel-ext-tests.patch changes current tests to always use reg-imm AND for 8-bit zext instead of UXTB. This simplifies code since it is supported on ARMv4t and later, and at least on A15 both should perform exactly the same (both have exec 1 uop 1, type I). 2013-05-31-char-shift-crash.ll is a bitcode version of the above bug 16178 repro. fast-isel-ext.ll tests all sext/zext combinations that ARM FastISel should now handle. Note that my ARM FastISel enabling patch was reverted due to a separate failure when dealing with MCJIT, I'll fix this second failure and then turn FastISel on again for non-iOS ARM targets. I've tested "make check-all" on my x86 box, and "lnt test-suite" on A15 hardware. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183551 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-07 20:10:37 +00:00
Quentin Colombet	fcca6c690c	Teach AsmPrinter how to print odd constants. Fix an assertion when the compiler encounters big constants whose bit width is not a multiple of 64-bits. Although clang would never generate something like this, the backend should be able to handle any legal IR. <rdar://problem/13363576> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183544 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-07 18:36:03 +00:00
Roman Divacky	6ca5fd3f30	Fix a typo in asm string of BP* family of instructions. With this fix I am able to compile/assemble/link/run /bin/echo from FreeBSD. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183537 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-07 17:46:57 +00:00
Rafael Espindola	62ed8d3e35	Support OpenBSD's native frame protection conventions. OpenBSD's stack smashing protection differs slightly from other platforms: 1. The smash handler function is "__stack_smash_handler(const char *funcname)" instead of "__stack_chk_fail(void)". 2. There's a hidden "long __guard_local" object that gets linked into each executable and DSO. Patch by Matthew Dempsky. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183533 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-07 16:35:57 +00:00
Venkatraman Govindaraju	01021a8b93	[Sparc]: Use cmp instruction instead of subcc to compare integers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183463 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-07 00:03:36 +00:00
Vincent Lejeune	f3d6e32c09	R600: Add a pass that merge Vector Register Previously committed @183279 but tests were failing, reverted @183286 It was broken because @183336 was missing, now it's there. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183343 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-05 21:38:04 +00:00
Vincent Lejeune	512119770e	R600: Schedule copy from phys register at beginning of block It allows regalloc pass to remove them by trivially assigning associated reg git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183336 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-05 20:27:35 +00:00
Akira Hatanaka	8270e68c56	[mips] brcond + setgt/setugt instruction selection patterns. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183334 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-05 19:49:55 +00:00
Michael Liao	9a508ef64a	[PATCH] Fix VGATHER* operand constraints Add earlyclobber constraints to prevent input register being allocated as the output register because, according to Intel spec [1], "If any pair of the index, mask, or destination registers are the same, this instruction results a UD fault." --- [1] http://software.intel.com/sites/default/files/319433-014.pdf git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183327 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-05 18:12:26 +00:00
Tom Stellard	ad7ecc65b1	R600: Make sure to schedule AR register uses and defs in the same clause Reviewed-by: vljn at ovi.com git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183294 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-05 03:43:06 +00:00
Rafael Espindola	6afb65c2b7	Revert "R600: Add a pass that merge Vector Register" This reverts commit r183279. CodeGen/R600/texture-input-merge.ll was failing. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183286 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-05 01:48:30 +00:00
Vincent Lejeune	bbbdba891b	R600: Add a pass that merge Vector Register git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183279 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-04 23:17:26 +00:00
Vincent Lejeune	e67a4afb5d	R600: Const/Neg/Abs can be folded to dot4 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183278 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-04 23:17:15 +00:00
Evan Cheng	00ed010d9e	Cortex-R5 can issue Thumb2 integer division instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183275 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-04 22:52:09 +00:00
David Majnemer	35e7751af4	ARM: Fix crash in ARM backend inside of ARMConstantIslandPass The ARM backend did not expect LDRBi12 to hold a constant pool operand. Allow for LLVM to deal with the instruction similar to how it deals with LDRi12. This fixes PR16215. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183238 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-04 17:46:15 +00:00
Vincent Lejeune	98017a015b	R600: Swizzle texture/export instructions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183229 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-04 15:04:53 +00:00
Vincent Lejeune	9328438329	R600: Add a test for r183108 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183228 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-04 15:03:35 +00:00
Tom Stellard	e5fcc0dee4	R600/SI: Add support for work item and work group intrinsics git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183138 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-03 17:40:18 +00:00
Tom Stellard	e7397ee81a	R600/SI: Add a calling convention for compute shaders git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183137 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-03 17:40:11 +00:00
Tom Stellard	e86f9d70ca	R600/SI: Custom lower i64 sign_extend git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183136 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-03 17:40:03 +00:00
Tom Stellard	132183510f	R600/SI: Add support for global loads git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183131 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-03 17:39:43 +00:00
Vincent Lejeune	96fe0be43b	R600: use capital letter for PV channel git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183107 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-03 15:44:35 +00:00
Venkatraman Govindaraju	e7cbb792c9	Sparc: Add support for indirect branch and blockaddress in Sparc backend. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183094 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-03 05:58:33 +00:00
Venkatraman Govindaraju	85cc972a06	Sparc: When storing 0, use %g0 directly in the store instruction instead of using two instructions (sethi and store). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183090 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-03 00:21:54 +00:00
Venkatraman Govindaraju	65ca7aa57d	Sparc: Combine add/or/sethi instruction with restore if possible. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183088 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-02 21:48:17 +00:00
Venkatraman Govindaraju	dd48226b15	Sparc: Perform leaf procedure optimization by default git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183083 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-02 02:24:27 +00:00
Venkatraman Govindaraju	a0b34d6c4a	Sparc: Mark functions calling llvm.vastart and llvm.returnaddress intrinsics as non-leaf functions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183079 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-01 20:42:48 +00:00
Tim Northover	3ba14fab1b	Revert r183069: "TMP: LEA64_32r fixing" Very sorry, it was committed from the wrong branch by mistake. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183070 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-01 10:23:46 +00:00
Tim Northover	4d3ace4da0	TMP: LEA64_32r fixing git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183069 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-01 10:21:54 +00:00
Tim Northover	85c622d6b6	X86: change MOV64ri64i32 into MOV32ri64 The MOV64ri64i32 instruction required hacky MCInst lowering because it was allocated as setting a GR64, but the eventual instruction ("movl") only set a GR32. This converts it into a so-called "MOV32ri64" which still accepts a (appropriate) 64-bit immediate but defines a GR32. This is then converted to the full GR64 by a SUBREG_TO_REG operation, thus keeping everyone happy. This fixes a typo in the opcode field of the original patch, which should make the legacy JIT work again (& adds test for that problem). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183068 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-01 09:55:14 +00:00
Venkatraman Govindaraju	72ad17c48c	[Sparc] Generate correct code for leaf functions with stack objects git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183067 91177308-0d34-0410-b5e6-96231b3b80d8	2013-06-01 04:51:18 +00:00
Eric Christopher	34431085de	Temporarily Revert "X86: change MOV64ri64i32 into MOV32ri64" as it seems to have caused PR16192 and other JIT related failures. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183059 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-31 23:30:45 +00:00
Quentin Colombet	5b00f4edcb	Modify how the formulae are rated in Loop Strength Reduce. Namely, check if the target allows to fold more than one register in the addressing mode and if yes, adjust the cost accordingly. Prior to this commit, reg1 + scale * reg2 accesses were artificially preferred to reg1 + reg2 accesses. Indeed, the cost model wrongly assumed that reg1 + reg2 needs a temporary register for the computation, whereas it was correctly estimated for reg1 + scale * reg2. <rdar://problem/13973908> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183021 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-31 17:20:29 +00:00
Richard Sandiford	b6606e46ab	[SystemZ] Don't use LOAD and STORE REVERSED for volatile accesses Unlike most -- hopefully "all other", but I'm still checking -- memory instructions we support, LOAD REVERSED and STORE REVERSED may access the memory location several times. This means that they are not suitable for volatile loads and stores. This patch is a prerequisite for better atomic load and store support. The same principle applies there: almost all memory instructions we support are inherently atomic ("block concurrent"), but LOAD REVERSED and STORE REVERSED are exceptions. Other instructions continue to allow volatile operands. I will add positive "allows volatile" tests at the same time as the "allows atomic load or store" tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@183002 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-31 13:25:22 +00:00
Justin Holewinski	5443e7d790	[NVPTX] Re-enable support for virtual registers in the final output Now that 3.3 is branched, we are re-enabling virtual registers to help iron out bugs before the next release. Some of the post-RA passes do not play well with virtual registers, so we disable them for now. The needed functionality of the PrologEpilogInserter pass is copied to a new backend-specific NVPTXPrologEpilog pass. The test for this commit is not breaking the existing tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182998 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-31 12:14:49 +00:00
Tim Northover	43887bf3e6	X86: change MOV64ri64i32 into MOV32ri64 The MOV64ri64i32 instruction required hacky MCInst lowering because it was allocated as setting a GR64, but the eventual instruction ("movl") only set a GR32. This converts it into a so-called "MOV32ri64" which still accepts a (appropriate) 64-bit immediate but defines a GR32. This is then converted to the full GR64 by a SUBREG_TO_REG operation, thus keeping everyone happy. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182991 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-31 09:57:13 +00:00
Akira Hatanaka	affed7e11d	[mips] Big-endian code generation for atomic instructions. Patch by Jyun-Yan You. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182984 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-31 03:25:44 +00:00
Rafael Espindola	9e3e730417	Revert r182937 and r182877. r182877 broke MCJIT tests on ARM and r182937 was working around another failure by r182877. This should make the ARM bots green. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182960 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-30 20:37:52 +00:00
Benjamin Kramer	3a36cb0805	Force a triple so we don't get bitten by windows' different regalloc. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182935 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-30 15:39:35 +00:00
Benjamin Kramer	051ed0ae2d	Force fragile test to the atom scheduler model. The pattern the test originally checked for doesn't occur on other -mcpu settings. On atom it's still there though slightly differently scheduled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182933 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-30 15:22:28 +00:00
Tim Northover	5b1552548a	X86: allow registers 8-15 in test This test was failing on some hosts when an unexpected register was used for a variable. This just extends the regexp to allow the new x86-64 registers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182929 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-30 13:56:32 +00:00
Tim Northover	15983b80a0	X86: use sub-register sequences for MOV*r0 operations Instead of having a bunch of separate MOV8r0, MOV16r0, ... pseudo-instructions, it's better to use a single MOV32r0 (which will expand to "xorl %reg, %reg") and obtain other sizes with EXTRACT_SUBREG and SUBREG_TO_REG. The encoding is smaller and partial register updates can sometimes be avoided. Until recently, this sequence was a barrier to rematerialization though. That should now be fixed so it's an appropriate time to make the change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182928 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-30 13:19:42 +00:00
Justin Holewinski	d5c52f1d76	[NVPTX] Fix case where a sext load of an i1 type may produce an ld.u1 instead of an ld.u8. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182924 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-30 12:22:39 +00:00
Richard Sandiford	14a926f13b	[SystemZ] Enable unaligned accesses The code to distinguish between unaligned and aligned addresses was already there, so this is mostly just a switch-on-and-test process. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182920 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-30 09:45:42 +00:00
Rafael Espindola	7486d92a6c	Change how we iterate over relocations on ELF. For COFF and MachO, sections semantically have relocations that apply to them. That is not the case on ELF. In relocatable objects (.o), a section with relocations in ELF has offsets to another section where the relocations should be applied. In dynamic objects and executables, relocations don't have an offset, they have a virtual address. The section sh_info may or may not point to another section, but that is not actually used for resolving the relocations. This patch exposes that in the ObjectFile API. It has the following advantages: * Most (all?) clients can handle this more efficiently. They will normally walk all relocations, so doing an effort to iterate in a particular order doesn't save time. * llvm-readobj now prints relocations in the same way the native readelf does. * probably most important, relocations that don't point to any section are now visible. This is the case of relocations in the rela.dyn section. See the updated relocation-executable.test for example. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182908 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-30 03:05:14 +00:00
Bill Wendling	8a65e11c9f	This testcase tests command line attributes which we don't yet support. In fact, we're probably going to support these flags in completely different ways. So this test is no longer valid. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182899 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-30 00:32:04 +00:00
Andrew Trick	6e0b2a0cb0	Order CALLSEQ_START and CALLSEQ_END nodes. Fixes PR16146: gdb.base__call-ar-st.exp fails after pre-RA-sched=source fixes. Patch by Xiaoyi Guo! This also fixes an unsupported dbg.value test case. Codegen was previously incorrect but the test was passing by luck. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182885 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-29 22:03:55 +00:00
JF Bastien	f567a6d39b	Enable FastISel on ARM for Linux and NaCl FastISel was only enabled for iOS ARM and Thumb2, this patch enables it for ARM (not Thumb2) on Linux and NaCl. Thumb2 support needs a bit more work, mainly around register class restrictions. The patch punts to SelectionDAG when doing TLS relocation on non-Darwin targets. I will fix this and other FastISel-to-SelectionDAG failures in a separate patch. The patch also forces FastISel to retain frame pointers: iOS always keeps them for backtracking (so emitted code won't change because of this), but Linux was getting much worse code that was incorrect when using big frames (such as test-suite's lencod). I'll also fix this in a later patch, it will probably require a peephole so that FastISel doesn't rematerialize frame pointers back-to-back. The test changes are straightforward, similar to: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130513/174279.html They also add a vararg test that got dropped in that change. I ran all of test-suite on A15 hardware with --optimize-option=-O0 and all the tests pass. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182877 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-29 20:38:10 +00:00
Tim Northover	aae0fa998a	Teach ReMaterialization to be more cunning about subregisters This allows rematerialization during register coalescing to handle more cases involving operations like SUBREG_TO_REG which might need to be rematerialized using sub-register indices. For example, code like: v1(GPR64):sub_32 = MOVZ something; v2(GPR64) = COPY v1(GPR64) should be convertible to: v2(GPR64):sub_32 = MOVZ something, but previously we just gave up in places like this. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182872 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-29 19:32:06 +00:00
Richard Sandiford	598377060d	[SystemZ] Two tests missing from previous commit git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182847 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-29 11:59:26 +00:00
Richard Sandiford	2d664abbfc	[SystemZ] Immediate compare-and-branch support This patch adds support for the CIJ and CGIJ instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182846 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-29 11:58:52 +00:00
Venkatraman Govindaraju	5300869256	[Sparc] Add support for leaf functions in sparc backend. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182822 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-29 04:46:31 +00:00
Richard Sandiford	d50bcb2162	[SystemZ] Register compare-and-branch support This patch adds support for the CRJ and CGRJ instructions. Support for the immediate forms will be a separate patch. The architecture has a large number of comparison instructions. I think it's generally better to concentrate on using the "best" comparison instruction first and foremost, then only use something like CRJ if CR really was the natural choice of comparison instruction. The patch therefore opportunistically converts separate CR and BRC instructions into a single CRJ while emitting instructions in ISelLowering. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182764 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-28 10:41:11 +00:00
Preston Gurd	b704d23062	Convert sqrt functions into sqrt instructions when -ffast-math is in effect. When -ffast-math is in effect (on Linux, at least), clang defines __FINITE_MATH_ONLY__ > 0 when including <math.h>. This causes the preprocessor to include <bits/math-finite.h>, which renames the sqrt functions. For instance, "sqrt" is renamed as "__sqrt_finite". This patch adds the 3 new names in such a way that they will be treated as equivalent to their respective original names. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182739 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-27 15:44:35 +00:00
Rafael Espindola	f594e41ae9	Add a cpu to try to bring back the atom bots. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182734 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-27 13:22:52 +00:00
Hal Finkel	1907cad7c8	Prefer to duplicate PPC Altivec loads when expanding unaligned loads When expanding unaligned Altivec loads, we use the decremented offset trick to prevent page faults. Unfortunately, if we have a sequence of consecutive unaligned loads, this leads to suboptimal code generation because the 'extra' load from the first unaligned load can be combined with the base load from the second (but only if the decremented offset trick is not used for the first). Search up and down the chain, through loads and token factors, looking for consecutive loads, and if one is found, don't use the offset reduction trick. These duplicate loads are later combined to yield the desired sequence (in the future, we might want a more-powerful chain search, but that will require some changes to allow the combiner routines to access the AA object). This should complete the initial implementation of the optimized unaligned Altivec load expansion. There is some refactoring that should be done, but that will happen when the unaligned store expansion is added. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182719 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-26 18:08:30 +00:00
Andrew Trick	9edb37feb5	Fix PR16143: Insert DEBUG_VALUE before terminator. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182717 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-26 08:58:50 +00:00
Hal Finkel	5a0e60425f	PPC: Combine duplicate (offset) lvsl Altivec intrinsics The lvsl permutation control instruction is a function only of the alignment of the pointer operand (relative to the 16-byte natural alignment of Altivec vectors). As a result, multiple lvsl intrinsics where the operands differ by a multiple of 16 can be combined. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182708 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-25 04:05:05 +00:00
Andrew Trick	81349a7435	Track IR ordering of SelectionDAG nodes 4/4. Unit test cases for -pre-RA-sched=source. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182706 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-25 03:26:51 +00:00
Andrew Trick	dd0fb018a7	Track IR ordering of SelectionDAG nodes 3/4. Remove the old IR ordering mechanism and switch to new one. Fix unit test failures. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182704 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-25 03:08:10 +00:00
Hal Finkel	80d10ded8c	PPC: Initial support for permutation-based unaligned Altivec loads Altivec only directly supports aligned loads, but the loads have a strange property: If given an unaligned address, they truncate the address to the next lower aligned address, and load from there. This property, along with an extra load and some special-purpose permutation-control instructions that generate the appropriate permutations from the original unaligned address, allows efficient lowering of unaligned loads. This code uses the trick explained in the Apple Velocity Engine optimization overview document to prevent the needed extra load from possibly causing a page fault if the original address happens to be aligned. As noted in the FIXMEs, there are several additional optimizations that can be performed to reduce the cost of these loads even more. These will be implemented in future commits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182691 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-24 23:00:14 +00:00
Diego Novillo	77226a03dc	Add a new function attribute 'cold' to functions. Other than recognizing the attribute, the patch does little else. It changes the branch probability analyzer so that edges into blocks postdominated by a cold function are given low weight. Added analysis and code generation tests. Added documentation for the new attribute. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182638 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-24 12:26:52 +00:00
Tim Northover	5a02fc4b5f	ARM: implement @llvm.readcyclecounter intrinsic This implements the @llvm.readcyclecounter intrinsic as the specific MRC instruction specified in the ARM manuals for CPUs with the Power Management extensions. Older CPUs had slightly different methods which may also have to be implemented eventually, but this should cover all v7 cases. rdar://problem/13939186 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182603 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-23 19:11:20 +00:00
Tom Stellard	d078070f6a	R600: Fix R600ControlFlowFinalizer not considering VTX_READ 128 bit dst reg Patch by: Vincent Lejeune https://bugs.freedesktop.org/show_bug.cgi?id=64877 NOTE: This is a candidate for the 3.3 branch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182600 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-23 18:26:42 +00:00
Jakob Stoklund Olesen	e0b59774cb	Fix PR16110: Handle DBG_VALUE in ConnectedVNInfoEqClasses::Distribute(). Now that the LiveDebugVariables pass is running after register coalescing, the ConnectedVNInfoEqClasses class needs to deal with DBG_VALUE instructions. This only comes up when rematerialization during coalescing causes the remaining live range of a virtual register to separate into two connected components. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182592 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-23 17:02:23 +00:00
Nick Lewycky	fa03ff99b2	Add missing test from r175092. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182564 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-23 07:46:13 +00:00
Nadav Rotem	23d1d5eb56	X86: Fix a bug in EltsFromConsecutiveLoads. We can't generate new loads without chains. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182507 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-22 19:28:41 +00:00
Benjamin Kramer	60ef6c9295	X86: When expanding PCMPGTQ to PCMPGTD we always want to compare the lower halves as unsigned. Take #2 on fixing PR15977. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182486 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-22 17:01:12 +00:00
David Majnemer	3b4b5367da	X86: Remove test instructions following shift by immediate instructions Allow LLVM to take advantage of shift instructions that set the ZF flag, making instructions that test the destination superfluous. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182454 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-22 08:13:02 +00:00
Akira Hatanaka	2591b5c6c3	[mips] Rename option to make it compatible with gcc. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182397 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-21 17:17:59 +00:00
Akira Hatanaka	1d4d32398d	[mips] Add instruction selection patterns for blez and bgez. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182396 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-21 17:13:47 +00:00
Justin Holewinski	b9c26dcb24	[NVPTX] Add @llvm.nvvm.sqrt.f() intrinsic git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182394 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-21 16:51:30 +00:00
Justin Holewinski	c2b7f5fa51	Drop @llvm.annotation and @llvm.ptr.annotation intrinsics during codegen. The intrinsic calls are dropped, but the annotated value is propagated. Fixes PR 15253 Original patch by Zeng Bin! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182387 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-21 14:37:16 +00:00
Benjamin Kramer	f106d8bad6	X86: When emulating unsigned PCMPGTQ with PCMPGTD, fix the sign bit for the smaller type. Otherwise we'll get a mix of signed and unsigned compares. Fixes PR15977. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182364 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-21 09:58:54 +00:00
Richard Sandiford	af2a1bebfc	[SystemZ] Tighten branch tests After r182274, the branches in these tests must always be short. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182358 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-21 08:53:17 +00:00
Benjamin Kramer	f19b8b018b	DAGCombine: Avoid an edge case where it tried to create an i0 type for (x & 0) == 0. Fixes PR16083. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182357 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-21 08:51:09 +00:00
Reed Kotler	49d44a080a	Add checks that the proper predefined stubs are being called to the test case. These were accidentally omitted. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182347 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-21 01:27:36 +00:00
Reed Kotler	bf00bf9ad2	Add some additional functions to the list of helper functions for pic calls. These need to be there so we don't try and use helper functions when we call those. As part of this, make sure that we properly exclude helper functions in pic mode when indirect calls are involved. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182343 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-21 00:50:30 +00:00
Akira Hatanaka	1aeb13bd9c	[mips] Add (setne $lhs, 0) instruction selection pattern. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182307 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-20 18:18:07 +00:00
Akira Hatanaka	f894199a14	[mips] Trap on integer division by zero. By default, a teq instruction is inserted after integer divide. No divide-by-zero checks are performed if option "-mnocheck-zero-division" is used. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182306 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-20 18:07:43 +00:00
Justin Holewinski	9b39c726a0	[NVPTX] Fix mis-use of CurrentFnSym in NVPTXAsmPrinter. This was causing a symbol name error in the output PTX. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182298 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-20 16:42:18 +00:00
Tom Stellard	4f8d90df45	R600: Fix rotr.ll on non-asserts builds The -debug-only option is only available on asserts builds. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182291 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-20 15:28:48 +00:00
Tom Stellard	0bbfc9313c	R600/SI: Add pattern for rotr Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182286 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-20 15:02:24 +00:00
Tom Stellard	ba534c2143	R600: Swap the legality of rotl and rotr The hardware supports rotr and not rotl. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182285 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-20 15:02:19 +00:00
Tom Stellard	a9d5d0b346	R600/SI: Add patterns for 64-bit shift operations Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182284 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-20 15:02:12 +00:00
Richard Sandiford	44b486ed78	[SystemZ] Add long branch pass Before this change, the SystemZ backend would use BRCL for all branches and only consider shortening them to BRC when generating an object file. E.g. a branch on equal would use the JGE alias of BRCL in assembly output, but might be shortened to the JE alias of BRC in ELF output. This was a useful first step, but it had two problems: (1) The z assembler isn't traditionally supposed to perform branch shortening or branch relaxation. We followed this rule by not relaxing branches in assembler input, but that meant that generating assembly code and then assembling it would not produce the same result as going directly to object code; the former would give long branches everywhere, whereas the latter would use short branches where possible. (2) Other useful branches, like COMPARE AND BRANCH, do not have long forms. We would need to do something else before supporting them. (Although COMPARE AND BRANCH does not change the condition codes, the plan is to model COMPARE AND BRANCH as a CC-clobbering instruction during codegen, so that we can safely lower it to a separate compare and long branch where necessary. This is not a valid transformation for the assembler proper to make.) This patch therefore moves branch relaxation to a pre-emit pass. For now, calls are still shortened from BRASL to BRAS by the assembler, although this too is not really the traditional behaviour. The first test takes about 1.5s to run, and there are likely to be more tests in this vein once further branch types are added. The feeling on IRC was that 1.5s is a bit much for a single test, so I've restricted it to SystemZ hosts for now. The patch exposes (and fixes) some typos in the main CodeGen/SystemZ tests. A later patch will remove the {{g}}s from that directory. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182274 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-20 14:23:08 +00:00
Justin Holewinski	7536ecf291	[NVPTX] Add GenericToNVVM IR converter to better handle idiomatic LLVM IR inputs This converter currently only handles global variables in address space 0. For these variables, they are promoted to address space 1 (global memory), and all uses are updated to point to the result of a cvta.global instruction on the new variable. The motivation for this is address space 0 global variables are illegal since we cannot declare variables in the generic address space. Instead, we place the variables in address space 1 and explicitly convert the pointer to address space 0. This is primarily intended to help new users who expect to be able to place global variables in the default address space. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182254 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-20 12:13:32 +00:00
Justin Holewinski	55fdf53629	[NVPTX] Fix i1 kernel parameters and global variables. ABI rules say we need to use .u8 for i1 parameters for kernels. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182253 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-20 12:13:28 +00:00
Stepan Dyatkovskiy	083bc97344	PR15868 fix. Introduction: When the stack alignment is 8 and the size of the GPRs part of a parameter is not N*8, we add padding to the GPRs part, so the part's last byte must be recovered at address K*8-1. We need to do this because the remaining (stack) part of the parameter starts at address K*8, and we need to "attach" the "GPRs head" to it without gaps: Stack: |---- 8 bytes block ----| |---- 8 bytes block ----| |---- 8 bytes... [ [padding] [GPRs head] ] [ ------ Tail passed via stack ------ ... FIX: Note that once we have added padding, we need to correct *all* Arg offsets that come after the padded one. That's why we need this fix: Arg offsets were never corrected before this patch. See the new test cases included in the patch. We also don't need to insert padding for byval parameters that are stored in GPRs only. We need to pad only the last byval parameter, and only when it extends beyond the GPRs and the stack alignment is 8. However, the stack area allocated for recovered byval params must satisfy the "Size mod 8 = 0" restriction. This patch reduces stack usage in some cases: we can reduce ArgRegsSaveArea since inner N*4 bytes sized byval params may be "packed" with alignment 4 in some cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182237 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-20 08:01:34 +00:00
Jakob Stoklund Olesen	89f530ebbf	Also expand 64-bit bitcasts. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182229 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-20 01:01:43 +00:00
Jakob Stoklund Olesen	5e5b78ca36	Implement spill and fill of I64Regs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182228 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-20 00:53:25 +00:00
Jakob Stoklund Olesen	900622e099	Mark i64 SETCC as expand so it is turned into a SELECT_CC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182227 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-20 00:28:36 +00:00
Jakob Stoklund Olesen	634123e98d	Don't use %g0 to materialize 0 directly. The wired physreg doesn't work on tied operands like on MOVXCC. Add a README note to fix this later. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182225 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-19 21:47:13 +00:00
Jakob Stoklund Olesen	60abcb786e	Select i64 values with %icc conditions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182224 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-19 20:38:21 +00:00
Jakob Stoklund Olesen	51d46c36bc	Add floating point selects on %xcc predicates. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182222 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-19 20:33:11 +00:00
Jakob Stoklund Olesen	89db6732fb	Implement SPselectfcc for i64 operands. Also clean up the arguments to all the MOVCC instructions so the operands always are (true-val, false-val, cond-code). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182221 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-19 20:20:54 +00:00
Venkatraman Govindaraju	21886a495a	[Sparc] Rearrange integer registers' allocation order so that register allocator will use I and G registers before using L and O registers. Also, enable registers %g2-%g4 to be used in application and %g5 in 64 bit mode. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182219 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-19 20:07:20 +00:00
Jakob Stoklund Olesen	00ce0f6512	Handle i64 FrameIndex nodes in SPARC v9 mode. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182216 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-19 19:14:24 +00:00
Hal Finkel	bf0bc3b2a2	Check InlineAsm clobbers in PPCCTRLoops We don't need to reject all inline asm as using the counter register (most does not). Only those that explicitly clobber the counter register need to prevent the transformation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182191 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-18 09:20:39 +00:00
David Majnemer	8a55c2ecd4	X86: Bad peephole interaction between adc, MOV32r0 The peephole tries to reorder MOV32r0 instructions such that they are before the instruction that modifies EFLAGS. The problem is that the peephole does not consider the case where the instruction that modifies EFLAGS also depends on the previous state of EFLAGS. Instead, walk backwards until we find an instruction that has a def for EFLAGS but does not have a use. If we find such an instruction, insert the MOV32r0 before it. If it cannot find such an instruction, skip the optimization. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182184 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-18 01:02:03 +00:00
JF Bastien	bab06ba696	Support unaligned load/store on more ARM targets This patch matches GCC behavior: the code used to only allow unaligned load/store on ARM for v6+ Darwin, it will now allow unaligned load/store for v6+ Darwin as well as for v7+ on Linux and NaCl. The distinction is made because v6 doesn't guarantee support (but LLVM assumes that Apple controls hardware+kernel and therefore have conformant v6 CPUs), whereas v7 does provide this guarantee (and Linux/NaCl behave sanely). The patch keeps the -arm-strict-align command line option, and adds -arm-no-strict-align. They behave similarly to GCC's -mstrict-align and -mnostrict-align. I originally encountered this discrepancy in FastIsel tests which expect unaligned load/store generation. Overall this should slightly improve performance in most cases because of reduced I$ pressure. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182175 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-17 23:49:01 +00:00
Vincent Lejeune	df98ad3959	R600: Lower int_load_input to copyFromReg instead of Register node It solves a bug uncovered by dot4 patch where the register class of int_load_input use was ignored. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182130 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-17 16:51:06 +00:00
Vincent Lejeune	76fc2d077f	R600: Use bottom up scheduling algorithm git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182129 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-17 16:50:56 +00:00
Vincent Lejeune	21ca0b3ea4	R600: Use depth first scheduling algorithm It should increase PV substitution opportunities and lower gpr usage (pending computation paths are "flushed" sooner) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182128 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-17 16:50:44 +00:00
Vincent Lejeune	4ed9917147	R600: Relax some vector constraints on Dot4. Dot4 now uses 8 scalar operands instead of 2 vector ones, which allows the register coalescer to remove some unneeded COPYs. This patch also defines some structures/functions that can be used to handle all vector instructions (CUBE, Cayman special instructions...) in a similar fashion. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182126 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-17 16:50:32 +00:00
Vincent Lejeune	d3293b49f9	R600: Improve texture handling git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182125 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-17 16:50:20 +00:00
Vincent Lejeune	4109bd8829	R600: Rename 128 bit registers. Almost all instructions that take a 128-bit reg as input (fetch, export...) have the ability to swizzle their argument and output. Instead of printing the default swizzle for each 128-bit reg, rename T.XYZW to T and let instructions print potentially optimized swizzles themselves. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182124 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-17 16:50:09 +00:00
Tom Stellard	0976e3c6d9	R600: Fix encoding for R600 family GPUs Reviewed-by: Vincent Lejeune <vljn@ovi.com> https://bugs.freedesktop.org/show_bug.cgi?id=64193 https://bugs.freedesktop.org/show_bug.cgi?id=64257 https://bugs.freedesktop.org/show_bug.cgi?id=64320 NOTE: This is a candidate for the 3.3 branch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182113 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-17 15:23:21 +00:00
Venkatraman Govindaraju	a65d33760b	[Sparc] Implements hasReservedCallFrame and hasFP. This is to generate correct framesetup code when the function has variable sized allocas. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182108 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-17 15:14:34 +00:00
Benjamin Kramer	a0de26ce34	X86: Make shuffle -> shift conversion more aggressive about undefs. Shuffles that only move an element into position 0 of the vector are common in the output of the loop vectorizer and often generate suboptimal code when SSSE3 is not available. Lower them to vector shifts if possible. We still prefer palignr over psrldq because it has higher throughput on sandybridge. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182102 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-17 14:48:34 +00:00
Benjamin Kramer	c032d1aca0	FileCheckize test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182101 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-17 14:48:25 +00:00
Venkatraman Govindaraju	d6b4caf291	[Sparc] Prevent instructions that define or use %o7 from being placed in a call's delay slot. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182063 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 23:53:29 +00:00
Akira Hatanaka	ae7e7cb3d3	[mips] Improve instruction selection for pattern (store (fp_to_sint $src), $ptr). Previously, three instructions were needed: trunc.w.s $f0, $f2 mfc1 $4, $f0 sw $4, 0($2) Now we need only two: trunc.w.s $f0, $f2 swc1 $f0, 0($2) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182053 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 21:17:15 +00:00
Rafael Espindola	529874cf0c	More test coverage for addFrameMove. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182051 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 20:50:56 +00:00
Hal Finkel	ae06fa2542	Fix cpu on test CodeGen/PowerPC/ctrloop-fp64.ll We need ppc instead of generic to override native features on ppc machines. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182049 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 20:28:05 +00:00
Rafael Espindola	7733728ac2	More addFrameMove test coverage. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182046 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 20:00:45 +00:00
Hal Finkel	c482454e3c	Create a new preheader in PPCCTRLoops to avoid counter register clobbers Some IR-level instructions (such as FP <-> i64 conversions) are not chained w.r.t. the mtctr intrinsic and yet may become function calls that clobber the counter register. At the selection-DAG level, these might be reordered with the mtctr intrinsic causing miscompiles. To avoid this situation, if an existing preheader has instructions that might use the counter register, create a new preheader for the mtctr intrinsic. This extra block will be remerged with the old preheader at the MI level, but will prevent unwanted reordering at the selection-DAG level. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182045 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 19:58:38 +00:00
Akira Hatanaka	02e168003f	[mips] Test case for r182042. Add comment. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182044 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 19:57:23 +00:00
Rafael Espindola	50f02f9d21	More test coverage for addFrameMove. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182041 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 19:44:40 +00:00
Benjamin Kramer	8401ed21aa	DAGCombine: Also shrink eq compares where the constant is exactly as large as the smaller type. if ((x & 255) == 255) before: movzbl %al, %eax cmpl $255, %eax after: cmpb $-1, %al git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182038 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 18:47:58 +00:00
Ulrich Weigand	347a5079e1	[PowerPC] Use true offset value in "memrix" machine operands This is the second part of the change to always return "true" offset values from getPreIndexedAddressParts, tackling the case of "memrix" type operands. This is about instructions like LD/STD that only have a 14-bit field to encode immediate offsets, which are implicitly extended by two zero bits by the machine, so that in effect we can access 16-bit offsets as long as they are a multiple of 4. The PowerPC back end currently handles such instructions by carrying the 14-bit value (as it will get encoded into the actual machine instructions) in the machine operand fields for such instructions. This means that those values are in fact not the true offset, but rather the offset divided by 4 (and then truncated to an unsigned 14-bit value). Like in the case fixed in r182012, this makes common code operations on such offset values not work as expected. Furthermore, there doesn't really appear to be any strong reason why we should encode machine operands this way. This patch therefore changes the encoding of "memrix" type machine operands to simply contain the "true" offset value as a signed immediate value, while enforcing the rules that it must fit in a 16-bit signed value and must also be a multiple of 4. This change must be made simultaneously in all places that access machine operands of this type. However, just about all those changes make the code simpler; in many cases we can now just share the same code for memri and memrix operands. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182032 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 17:58:02 +00:00
Hal Finkel	2a5e8c328e	PPC32 cannot form counter loops around i64 FP conversions On PPC32, i64 FP conversions are implemented using runtime calls (which clobber the counter register). These must be excluded. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182023 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 16:52:41 +00:00
Rafael Espindola	3e521a5223	Add a triple to the test to try to fix the windows bots. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182022 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 16:48:46 +00:00
Rafael Espindola	c2d01fd5c2	More addFrameMove test coverage. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182021 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 16:34:38 +00:00
Bill Schmidt	0d6423b476	Use new CHECK-DAG support to stabilize CodeGen/PowerPC/recipest.ll While testing some experimental code to add vector-scalar registers to PowerPC, I noticed that a couple of independent instructions were flipped by the scheduler. The new CHECK-DAG support is perfect for avoiding this problem. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182020 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 16:15:18 +00:00
Rafael Espindola	8da0cebc92	Add more addFrameMove test coverage. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182019 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 16:09:54 +00:00
Rafael Espindola	3808c4d206	Add more test coverage for addFrameMove. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182017 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 15:18:50 +00:00
Ulrich Weigand	f0ef882828	[PowerPC] Report true displacement value from getPreIndexedAddressParts DAGCombiner::CombineToPreIndexedLoadStore calls a target routine to decompose a memory address into a base/offset pair. It expects the offset (if constant) to be the true displacement value in order to perform optional additional optimizations; in particular, to convert other uses of the original pointer into uses of the new base pointer after pre-increment. The PowerPC implementation of getPreIndexedAddressParts, however, simply calls SelectAddressRegImm, which returns a TargetConstant. This value is appropriate for encoding into the instruction, but it is not always usable as true displacement value: - Its type is always MVT::i32, even on 64-bit, where addresses ought to be i64 ... this causes the optimization to simply always fail on 64-bit due to this line in DAGCombiner: // FIXME: In some cases, we can be smarter about this. if (Op1.getValueType() != Offset.getValueType()) { - Its value is truncated to an unsigned 16-bit value if negative. This causes the above optimization to generate wrong code. This patch fixes both problems by simply returning the true displacement value (in its original type). This doesn't affect any other user of the displacement. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182012 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 14:53:05 +00:00
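To make the second problem in f0ef882828 concrete, the sketch below shows how a negative displacement, once truncated to an unsigned 16-bit value, yields a different address when it is reused as an ordinary offset. The function names are hypothetical and only model the arithmetic; they are not the DAGCombiner or PowerPC code.

```cpp
#include <cstdint>

// Hypothetical model of the encoded immediate: the displacement truncated
// to an unsigned 16-bit value, as suitable for the instruction encoding.
inline int64_t encodedImmediate(int64_t displacement) {
  return static_cast<uint16_t>(displacement);
}

// Hypothetical model of common code that treats the offset as a true
// displacement and adds it to the base address.
inline int64_t addressAfter(int64_t base, int64_t offset) {
  return base + offset;
}
```

With a base of 1000 and a displacement of -8, the true displacement gives 992, while the truncated immediate (65528) gives 66528.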
Rafael Espindola	26bca5816d	Add more addFrameMove test coverage. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182011 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 14:51:26 +00:00
Rafael Espindola	aba2d6d051	Extend test to check the .cfi instructions. I am about to refactor the calls to addFrameMove and some of the ppc ones were not being tested. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@182009 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 14:30:09 +00:00
Benjamin Kramer	d37635d2f2	Relax CHECK-NEXTs a bit to cope with atom's return nop padding. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181999 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 11:46:50 +00:00
Rafael Espindola	0225d5a3af	Extend test for better coverage. Without this change nothing was covering this addFrameMove: // For 64-bit SVR4 when we have spilled CRs, the spill location // is SP+8, not a frame-relative slot. if (Subtarget.isSVR4ABI() && Subtarget.isPPC64() && (PPC::CR2 <= Reg && Reg <= PPC::CR4)) { MachineLocation CSDst(PPC::X1, 8); MachineLocation CSSrc(PPC::CR2); MMI.addFrameMove(Label, CSDst, CSSrc); continue; } git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181976 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 03:48:50 +00:00
Reed Kotler	1a2265bc01	Patch number 2 for mips16/32 floating point interoperability stubs. This creates stubs that help Mips32 functions call Mips16 functions which have floating point parameters that are normally passed in floating point registers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181972 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-16 02:17:42 +00:00
David Majnemer	55a6f111fc	Set an explicit triple for this test. This allows the test to correctly check symbol names. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181939 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-15 22:23:21 +00:00
David Majnemer	17585dc4d4	X86: Remove redundant test instructions Increase the number of instructions LLVM recognizes as setting the ZF flag. This allows us to remove test instructions that redundantly recalculate the flag. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181937 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-15 22:03:08 +00:00
Hal Finkel	b1fd3cd78f	Implement PPC counter loops as a late IR-level pass The old PPCCTRLoops pass, like the Hexagon pass version from which it was derived, could only handle some simple loops in canonical form. We cannot directly adapt the new Hexagon hardware loops pass, however, because the Hexagon pass contains a fundamental assumption that non-constant-trip-count loops will contain a guard, and this is not always true (the result being that incorrect negative counts can be generated). With this commit, we replace the pass with a late IR-level pass which makes use of SE to calculate the backedge-taken counts and safely generate the loop-count expressions (including any necessary max() parts). This IR level pass inserts custom intrinsics that are lowered into the desired decrement-and-branch instructions. The most fragile part of this new implementation is that interfering uses of the counter register must be detected on the IR level (and, on PPC, this also includes any indirect branches in addition to function calls). Also, to make all of this work, we need a variant of the mtctr instruction that is marked as having side effects. Without this, machine-code level CSE, DCE, etc. illegally transform the resulting code. Hopefully, this can be improved in the future. This new pass is smaller than the original (and much smaller than the new Hexagon hardware loops pass), and can handle many additional cases correctly. In addition, the preheader-creation code has been copied from LoopSimplify, and after we decide on where it belongs, this code will be refactored so that it can be explicitly shared (making this implementation even smaller). The new test-case files ctrloop-{le,lt,ne}.ll have been adapted from tests for the new Hexagon pass. There are a few classes of loops that this pass does not transform (noted by FIXMEs in the files), but these deficiencies can be addressed within the SE infrastructure (thus helping many other passes as well). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181927 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-15 21:37:41 +00:00
Derek Schuff	c22cdb7203	Fix miscompile due to StackColoring incorrectly merging stack slots (PR15707) IR optimisation passes can result in a basic block that contains: llvm.lifetime.start(%buf) ... llvm.lifetime.end(%buf) ... llvm.lifetime.start(%buf) Before this change, calculateLiveIntervals() was ignoring the second lifetime.start() and was regarding %buf as being dead from the lifetime.end() through to the end of the basic block. This can cause StackColoring to incorrectly merge %buf with another stack slot. Fix by removing the incorrect Starts[pos].isValid() and Finishes[pos].isValid() checks. Just doing: Starts[pos] = Indexes->getMBBStartIdx(MBB); Finishes[pos] = Indexes->getMBBEndIdx(MBB); unconditionally would be enough to fix the bug, but it causes some test failures due to stack slots not being merged when they were before. So, in order to keep the existing tests passing, treat LiveIn and LiveOut separately rather than approximating the live ranges by merging LiveIn and LiveOut. This fixes PR15707. Patch by Mark Seaborn. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181922 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-15 21:15:09 +00:00
Richard Sandiford	ddbf053a4c	[SystemZ] Make use of SUBTRACT HALFWORD Thanks to Ulrich Weigand for noticing that this instruction was missing. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181893 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-15 15:05:29 +00:00
Arnold Schwaighofer	101a36117c	ARM ISel: Don't create illegal types during LowerMUL The transformation happening here is that we want to turn a "mul(ext(X), ext(X))" into a "vmull(X, X)", stripping off the extension. We have to make sure that X still has a valid vector type - possibly recreate an extension to a smaller type. In case of an extload of a memory type smaller than 64 bit we used to create an ext(load()). The problem with doing this - instead of recreating an extload - is that an illegal type is exposed. This patch fixes this by creating extloads instead of ext(load()) sequences. Fixes PR15970. radar://13871383 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181842 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-14 22:33:24 +00:00
Jyotsna Verma	a29a8965e2	Hexagon: Pass to replace transfer/copy instructions with combine instructions where possible. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181817 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-14 18:54:06 +00:00
Eric Christopher	f276c70bb8	Reapply "Subtract isn't commutative, fix this for MMX psub." with a somewhat randomly chosen cpu that will minimize cpu specific differences on bots. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181814 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-14 18:33:40 +00:00
Eric Christopher	edf0dda528	Temporarily revert "Subtract isn't commutative, fix this for MMX psub." It's causing failures on the atom bot. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181812 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-14 18:20:42 +00:00
Eric Christopher	304d73c9ee	Subtract isn't commutative, fix this for MMX psub. Patch by Andrea DiBiagio. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181809 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-14 17:52:05 +00:00
Jakob Stoklund Olesen	bc3db03bf0	Recognize sparc64 as an alias for sparcv9 triples. Patch by Brad Smith! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181808 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-14 17:47:27 +00:00
Jyotsna Verma	36e1b51438	Hexagon: Add patterns to generate 'combine' instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181805 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-14 17:16:38 +00:00
Jyotsna Verma	91eadc6d69	Hexagon: ArePredicatesComplement should not restrict itself to TFRs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181803 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-14 16:36:34 +00:00
Derek Schuff	ed788b6283	Fix ARM FastISel tests, as a first step to enabling ARM FastISel ARM FastISel is currently only enabled for iOS non-Thumb1, and I'm working on enabling it for other targets. As a first step I've fixed some of the tests. Changes to ARM FastISel tests: - Different triples don't generate the same relocations (especially movw/movt versus constant pool loads). Use a regex to allow either. - Mangling is different. Use a regex to allow either. - The reserved registers are sometimes different, so registers get allocated in a different order. Capture the names only where this occurs. - Add -verify-machineinstrs to some tests where it works. It doesn't work everywhere it should yet. - Add -fast-isel-abort to many tests that didn't have it before. - Split out the VarArg test from fast-isel-call.ll into its own test. This simplifies test setup because of --check-prefix. Patch by JF Bastien git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181801 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-14 16:26:38 +00:00
Bill Schmidt	ded53bf4dd	PPC32: Fix stack collision between FP and CR save areas. The changes to CR spill handling missed a case for 32-bit PowerPC. The code in PPCFrameLowering::processFunctionBeforeFrameFinalized() checks whether CR spill has occurred using a flag in the function info. This flag is only set by storeRegToStackSlot and loadRegFromStackSlot. spillCalleeSavedRegisters does not call storeRegToStackSlot, but instead produces MI directly. Thus we don't see the CR is spilled when assigning frame offsets, and the CR spill ends up colliding with some other location (generally the FP slot). This patch sets the flag in spillCalleeSavedRegisters for PPC32 so that the CR spill is properly detected and gets its own slot in the stack frame. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181800 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-14 16:08:32 +00:00
Jyotsna Verma	21e6ea54ff	Hexagon: Test case to check if branch probabilities are properly reflected in the jump instructions in the form of taken/not-taken hint. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181799 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-14 15:50:49 +00:00
Michel Danzer	5096dc74ae	R600/SI: Add lit test coverage for the remaining patterns added recently Reviewed-by: Christian König <christian.koenig@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181775 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-14 09:53:30 +00:00
Reed Kotler	eafa96485a	This is the first of three patches which creates stubs used for Mips16/32 floating point interoperability. When Mips16 code calls external functions that would normally have some of its parameters or return values passed in floating point registers, it needs (Mips32) helper functions to do this because while in Mips16 mode there is no ability to access the floating point registers. In Pic mode, this is done with a set of predefined functions in libc. This case is already handled in llvm for Mips16. In static relocation mode, for efficiency reasons, the compiler generates stubs that the linker will use if it turns out that the external function is a Mips32 function. (If it's Mips16, then it does not need the helper stubs). These stubs are identically named and the linker knows about these tricks and will not create multiple copies and will delete them if they are not needed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181753 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-14 02:00:24 +00:00
Akira Hatanaka	dd29df06fa	StackColoring: don't clear an instruction's mem operand if the underlying object is a PseudoSourceValue and PseudoSourceValue::isConstant returns true (i.e., points to memory that has a constant value). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181751 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-14 01:42:44 +00:00
Bill Schmidt	240b9b6078	PPC64: Constant initializers with dynamic relocations go in .data.rel.ro. This fixes warning messages observed in the oggenc application test in projects/test-suite. Special handling is needed for the 64-bit PowerPC SVR4 ABI when a constant is initialized with a pointer to a function in a shared library. Because a function address is implemented as the address of a function descriptor, the use of copy relocations can lead to problems with initialization. GNU ld therefore replaces copy relocations with dynamic relocations to be resolved by the dynamic linker. This means the constant cannot reside in the read-only data section, but instead belongs in .data.rel.ro, which is designed for constants containing dynamic relocations. The implementation creates a class PPC64LinuxTargetObjectFile inheriting from TargetLoweringObjectFileELF, which behaves like its parent except to place constants of this sort into .data.rel.ro. The test case is reduced from the oggenc application. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181723 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-13 19:34:37 +00:00
Akira Hatanaka	42f562a169	[mips] Add option -mno-ldc1-sdc1. This option is used when the user wants to avoid emitting double precision FP loads and stores. Double precision FP loads and stores are expanded to single precision instructions after register allocation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181718 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-13 18:23:35 +00:00
Lang Hames	d26c93d3a8	Correctly preserve the input chain for potential tailcall nodes whose return values are bitcasts. The chain had previously been being clobbered with the entry node to the dag, which sometimes caused other code in the function to be erroneously deleted when tailcall optimization kicked in. <rdar://problem/13827621> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181696 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-13 10:21:19 +00:00
Hao Liu	3778c04b2e	Fix PR15950 A bug in DAG Combiner about undef mask git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181682 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-13 02:07:05 +00:00
Reed Kotler	a6e31cf69e	Add -mtriple=mipsel-linux-gnu to the test so that the compiler does not think it can support small data sections. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181654 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-11 01:02:20 +00:00
Reed Kotler	46090914b7	Check-in of the first of several patches to finish implementation of mips16/mips32 floating point interoperability. This patch fixes returns from mips16 functions so that if the function was in fact called by a mips32 hard float routine, then values that would have been returned in floating point registers are so returned. Mips16 mode has no floating point instructions so there is no way to load values into floating point registers. This is needed when returning float, double, single complex, double complex in the Mips ABI. Helper functions in libc for mips16 are available to do this. For efficiency purposes, these helper functions have a different calling convention from normal Mips calls. Registers v0,v1,a0,a1 are used to pass parameters instead of a0,a1,a2,a3. This is because v0,v1,a0,a1 are the natural registers used to return floating point values in soft float. These values can then be moved to the appropriate floating point registers with no extra cost. The only register that is modified is ra in this call. The helper functions make sure that the return values are in the floating point registers that they would be in if soft float was not in effect (which it is for mips16, though the soft float is implemented using a mips32 library that uses hard float). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181641 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-10 22:25:39 +00:00
Jyotsna Verma	1a35b8e2eb	Hexagon: Fix switch cases in HexagonVLIWPacketizer.cpp. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181624 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-10 20:27:34 +00:00
Benjamin Kramer	768ebcdf63	DAGCombiner: Generate a correct constant for vector types when folding (xor (and)) into (and (not)). PR15948. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181597 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-10 14:09:52 +00:00
Tom Stellard	58e87a68a8	R600: Remove AMDILPeeopholeOptimizer and replace optimizations with tablegen patterns The BFE optimization was the only one we were actually using, and it was emitting an intrinsic that we don't support. https://bugs.freedesktop.org/show_bug.cgi?id=64201 Reviewed-by: Christian König <christian.koenig@amd.com> NOTE: This is a candidate for the 3.3 branch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181580 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-10 02:09:45 +00:00
Tom Stellard	dde6836456	R600: Expand SUB for v2i32/v4i32 Patch by: Aaron Watry Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Aaron Watry <awatry@gmail.com> NOTE: This is a candidate for the 3.3 branch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181579 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-10 02:09:39 +00:00
Tom Stellard	6c40d40d70	R600: Expand MUL for v4i32/v2i32 Fixes piglit test for OpenCL builtin mul24, and allows mad24 to run. Patch by: Aaron Watry Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Aaron Watry <awatry@gmail.com> NOTE: This is a candidate for the 3.3 branch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181578 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-10 02:09:34 +00:00
Tom Stellard	4fca5c1440	R600: Expand SRA for v4i32/v2i32 v2: Add v4i32 test Patch by: Aaron Watry Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Aaron Watry <awatry@gmail.com> NOTE: This is a candidate for the 3.3 branch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181577 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-10 02:09:29 +00:00
Tom Stellard	bdd9b1e89f	R600: Expand vselect for v4i32 and v2i32 v2: Add vselect v4i32 test Patch by: Aaron Watry Reviewed-by: Tom Stellard <thomas.stellard@amd.com> Signed-off-by: Aaron Watry <awatry@gmail.com> NOTE: This is a candidate for the 3.3 branch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181576 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-10 02:09:24 +00:00
Owen Anderson	58dcd200b7	Teach SelectionDAG to constant fold all-constant FMA nodes the same way that it constant folds FADD, FMUL, etc. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181555 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-09 22:27:13 +00:00
Bill Wendling	edfef3bd27	Generate a compact unwind encoding in the face of a stack alignment push. We generate a `push' of a random register (%rax) if the stack needs to be aligned by the size of that register. However, this could mess up compact unwind generation. In particular, we want to still generate compact unwind in the presence of this monstrosity. Check if the push is of the %rax/%eax register. If it is and it's marked with the `FrameSetup' flag, then we can generate a compact unwind encoding for the function only if the push is the last FrameSetup instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181540 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-09 20:10:38 +00:00

8530 Commits