llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-08-04 07:29:20 +00:00

Author	SHA1	Message	Date
David Peixotto	af7c042af1	PR14992 - Tablegen incorrectly converts ARM tLDMIA_UPD pseudo to tLDMIA Fixed bug in tablegen conversion when source pseudo instruction has a different number of arguments than the destination instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175066 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-13 19:21:47 +00:00
Benjamin Kramer	f09e02f01a	X86: Disable generation of rep;movsl when %esi is used as a base pointer. This happens when there is both stack realignment and a dynamic alloca in the function. If we overwrite %esi (rep;movsl uses fixed registers) we'll lose the base pointer and the next register spill will write into oblivion. Fixes PR15249 and unbreaks firefox on i386/freebsd. Mozilla uses dynamic allocas and freebsd a 4 byte stack alignment. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175057 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-13 13:40:35 +00:00
Reed Kotler	8080696103	Make jumptables work for -static git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175044 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-13 08:32:14 +00:00
Elena Demikhovsky	d29804f80d	Prevent insertion of "vzeroupper" before call that preserves YMM registers, since a caller uses preserved registers across the call. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175043 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-13 08:02:04 +00:00
Eric Christopher	23571f4f2c	Check i1 as well as i8 variables for 8 bit registers for x86 inline assembly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175036 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-13 06:01:05 +00:00
Eric Christopher	a4e8694053	Finish obviously broken thought. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175035 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-13 06:01:00 +00:00
Paul Redmond	de53477c91	Fix the lit test added in r174972 Patch by: Kevin Schoedel git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174974 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-12 16:07:27 +00:00
Jyotsna Verma	6b8d2026ba	Hexagon: Add support to generate predicated absolute addressing mode instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174973 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-12 16:06:23 +00:00
Paul Redmond	5c97450df7	PR14562 - Truncation of left shift became undef DAGCombiner::ReduceLoadWidth was converting (trunc i32 (shl i64 v, 32)) into (shl i32 v, 32) into undef. To prevent this, check the shift count against the final result size. Patch by: Kevin Schoedel Reviewed by: Nadav Rotem git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174972 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-12 15:21:21 +00:00
Justin Holewinski	7eacad03ef	[NVPTX] Disable vector registers Vectors were being manually scalarized by the backend. Instead, let the target-independent code do all of the work. The manual scalarization was from a time before good target-independent support for scalarization in LLVM. However, this forces us to specially-handle vector loads and stores, which we can turn into PTX instructions that produce/consume multiple operands. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174968 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-12 14:18:49 +00:00
Arnold Schwaighofer	d9316dacf5	ARM NEON: Handle v16i8 and v8i16 reverse shuffles Lower reverse shuffles to a vrev64 and a vext instruction instead of the default legalization of storing and loading to the stack. This is important because we generate reverse shuffles in the loop vectorizer when we reverse store to an array. uint8_t Arr[N]; for (i = 0; i < N; ++i) Arr[N - i - 1] = ... radar://13171760 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174929 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-12 01:58:32 +00:00
Krzysztof Parzyszek	71490fa946	Extend Hexagon hardware loop generation to handle various additional cases: - variety of compare instructions, - loops with no preheader, - arbitrary lower and upper bounds. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174904 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-11 21:37:55 +00:00
Justin Holewinski	ff5adad9f3	[NVPTX] Remove NoCapture from address space conversion intrinsics. NoCapture is not valid in this case, and was causing incorrect optimizations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174896 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-11 18:56:35 +00:00
Reed Kotler	b2d1275188	Add the 16 bit version of addiu. To the assembler, the 16 and 32 bit are the same so we put in the comment field an indicator when we think we are emitting the 16 bit version. For the direct object emitter, the difference is important as well as for other passes which need an accurate count of program size. There will be other similar putbacks to this for various instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174747 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-08 21:42:56 +00:00
Hal Finkel	089a5f8a8c	DAGCombiner: Constant folding around pre-increment loads/stores Previously, even when a pre-increment load or store was generated, we often needed to keep a copy of the original base register for use with other offsets. If all of these offsets are constants (including the offset which was combined into the addressing mode), then this is clearly unnecessary. This change adjusts these other offsets to use the new incremented address. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174746 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-08 21:35:47 +00:00
Bob Wilson	8f637adbd3	Revert 172027 and 174336. Remove diagnostics about over-aligned stack objects. Aside from the question of whether we report a warning or an error when we can't satisfy a requested stack object alignment, the current implementation of this is not good. We're not providing any source location in the diagnostics and the current warning is not connected to any warning group so you can't control it. We could improve the source location somewhat, but we can do a much better job if this check is implemented in the front-end, so let's do that instead. <rdar://problem/13127907> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174741 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-08 20:35:15 +00:00
Reed Kotler	61b97b8c17	When Mips16 frames grow large, the immediate field may exceed the maximum allowed size for the instruction. This code uses RegScavenger to fix this. We sometimes need 2 registers for Mips16 so we must handle things differently than how register scavenger is normally used. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174696 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-08 03:57:41 +00:00
Tom Stellard	1234c9be42	R600: Add support for SET_DX10 instructions These instructions compare two floating point values and return an integer true (-1) or false (0) value. When compiling code generated by the Mesa GLSL frontend, the SET_DX10 instructions save us four instructions for most branch decisions that use floating-point comparisons. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174609 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-07 14:02:35 +00:00
Tom Stellard	2a77cf7f47	R600: Add tests for unsupported condition codes. All of the le and lt variants are unsupported. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174608 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-07 14:02:33 +00:00
Tom Stellard	b4409610a2	R600: Fix assembly name for SETGT_INT git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174607 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-07 14:02:27 +00:00
Reed Kotler	24b339dcdc	Make sure we call externals from libraries properly when -static. For example, when we are doing mips16 hard float or soft float. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174583 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-07 04:34:51 +00:00
Reed Kotler	6e3443eed4	Enable jumps when in -static mode. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174580 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-07 03:49:51 +00:00
Eli Bendersky	16221a60a0	This is a follow-up on r174446, now taking Atom processors into account. Atoms use LEA for updating SP in prologs/epilogs, and the exact LEA opcode depends on the data model. Also reapplying the test case which was added and then reverted (because of Atom failures), this time specifying explicitly the CPU in addition to the triple. The test case now checks all variations (data mode, cpu Atom vs. Core). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174542 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-06 20:43:57 +00:00
Tim Northover	8a06229c89	Implement external weak (ELF) symbols on AArch64 Weakly defined symbols should evaluate to 0 if they're undefined at link-time. This is impossible to do with the usual address generation patterns, so we should use a literal pool entry to materlialise the address. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174518 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-06 16:43:33 +00:00
Eli Bendersky	a859afa859	Remove this test in the meantime, since it won't pass on Atom. Atom uses lea to move the stack pointer in prologs/epilogs. I will fix the test and add it back later. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174484 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-06 03:15:00 +00:00
Manman Ren	9c5861fdbd	Attempt to recover gdb bot after r174445. Failure: undefined symbol 'Lline_table_start0'. Root-cause: we use a symbol subtraction to calculate at_stmt_list, but the line table entries are not dumped in the assembly. Fix: use zero instead of a symbol subtraction for Compile Unit 0. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174479 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-06 00:59:41 +00:00
Eli Bendersky	61b057a6fd	Test for r174446 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174464 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-05 23:31:48 +00:00
Manman Ren	43213cf1ac	Dwarf: support for LTO where a single object file can have multiple line tables We generate one line table for each compilation unit in the object file. Reviewed by Eric and Kevin. rdar://problem/13067005 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174445 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-05 21:52:47 +00:00
Akira Hatanaka	baabdecbb9	[mips] Do not use function CC_MipsN_VarArg unless the function being analyzed is a vararg function. The original code was examining flag OutputArg::IsFixed to determine whether CC_MipsN_VarArg or CC_MipsN should be called. This is not correct, since this flag is often set to false when the function being analyzed is a non-variadic function. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174442 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-05 21:18:11 +00:00
Owen Anderson	b48783b091	Reapply r174343, with a fix for a scary DAG combine bug where it failed to differentiate between the alignment of the base point of a load, and the overall alignment of the load. This caused infinite loops in DAG combine with the original application of this patch. ORIGINAL COMMIT LOG: When the target-independent DAGCombiner inferred a higher alignment for a load, it would replace the load with one with the higher alignment. However, it did not place the new load in the worklist, which prevented later DAG combines in the same phase (for example, target-specific combines) from ever seeing it. This patch corrects that oversight, and updates some tests whose output changed due to slightly different DAGCombine outputs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174431 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-05 19:24:39 +00:00
Jyotsna Verma	1d3d2c57f5	Hexagon: Use TFR_cond with cmpb.[eq,gt,gtu] to handle zext( set[ne,eq,gt,ugt] (...) ) type of dag patterns. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174429 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-05 19:20:45 +00:00
Jyotsna Verma	f2c4db97e1	Hexagon: Add testcase for post-increment store instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174419 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-05 18:23:51 +00:00
Chad Rosier	1e45487dfd	[SjLj Prepare] When demoting an invoke instructions to the stack, if the normal edge is critical, then split it so we can insert the store. rdar://13126179 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174418 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-05 18:23:10 +00:00
Jyotsna Verma	691c365aad	Hexagon: Use multiclass for absolute addressing mode stores. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174412 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-05 18:15:34 +00:00
Jakob Stoklund Olesen	7088fb60ed	Add a test case for PR14750. This was fixed by r174402. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174405 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-05 18:04:15 +00:00
Tom Stellard	ebc535bc4a	R600: Add tests for instruction predicates git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174393 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-05 17:09:13 +00:00
Tom Stellard	3ce2ec8478	R600: Emit function name in the AsmPrinter Emitting the function name allows us to check for it in the FileCheck tests so we can make sure FileCheck is checking the output of the correct function. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174392 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-05 17:09:11 +00:00
Jyotsna Verma	4210da7253	Hexagon: Add V4 compare instructions. Enable relationship mapping for the existing instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174389 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-05 16:42:24 +00:00
NAKAMURA Takumi	eb260b2527	Revert r174343, "When the target-independent DAGCombiner inferred a higher alignment for a load," It caused hangups in compiling clang/lib/Parse/ParseDecl.cpp and clang/lib/Driver/Tools.cpp in stage2 on some hosts. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174374 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-05 14:44:16 +00:00
Logan Chien	b0c899666a	Link .ARM.exidx with corresponding text section. The sh_link in the ELF section header of .ARM.exidx should be filled with the section index of the corresponding text section. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174372 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-05 14:18:59 +00:00
Jack Carter	37ef65b9c1	This patch that sets the EmitAlias flag in td files and enables the instruction printer to print aliased instructions. Due to usage of RegisterOperands a change in common code (utils/TableGen/AsmWriterEmitter.cpp) is required to get the correct register value if it is a RegisterOperand. Contributer: Vladimir Medic git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174358 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-05 08:32:10 +00:00
Owen Anderson	429f7ef0c1	When the target-independent DAGCombiner inferred a higher alignment for a load, it would replace the load with one with the higher alignment. However, it did not place the new load in the worklist, which prevented later DAG combines in the same phase (for example, target-specific combines) from ever seeing it. This patch corrects that oversight, and updates some tests whose output changed due to slightly different DAGCombine outputs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174343 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-05 06:25:30 +00:00
Manman Ren	91b978e157	[Stack Alignment] emit warning instead of a hard error Per discussion in rdar://13127907, we should emit a hard error only if people write code where the requested alignment is larger than achievable and assumes the low bits are zeros. A warning should be good enough when we are not sure if the source code assumes the low bits are zeros. rdar://13127907 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174336 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-04 23:45:08 +00:00
Jyotsna Verma	3e1635d08c	Hexagon: Add V4 combine instructions and some more Def Pats for V2. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174331 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-04 15:52:56 +00:00
Benjamin Kramer	0d3731478e	Disable a couple more vector splat optimizations on PPC. I didn't see those because the test case used "not grep". FileCheck the test and XFAIL it, preserving the old optimization, so this can be fixed eventually. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174330 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-04 15:52:32 +00:00
Benjamin Kramer	a220aeb58f	X86: Open up some opportunities for constant folding by postponing shift lowering. Fixes PR15141. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174327 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-04 15:19:33 +00:00
Benjamin Kramer	4969310052	SelectionDAG: Teach FoldConstantArithmetic how to deal with vectors. This required disabling a PowerPC optimization that did the following: input: x = BUILD_VECTOR <i32 16, i32 16, i32 16, i32 16> lowered to: tmp = BUILD_VECTOR <i32 8, i32 8, i32 8, i32 8> x = ADD tmp, tmp The add now gets folded immediately and we're back at the BUILD_VECTOR we started from. I don't see a way to fix this currently so I left it disabled for now. Fix some trivially foldable X86 tests too. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174325 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-04 15:19:18 +00:00
David Blaikie	a8eefc7cc7	Remove the (apparently) unnecessary debug info metadata indirection. The main lists of debug info metadata attached to the compile_unit had an extra layer of metadata nodes they went through for no apparent reason. This patch removes that (& still passes just as much of the GDB 7.5 test suite). If anyone can show evidence as to why these extra metadata nodes are there I'm open to reverting this patch & documenting why they're there. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174266 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-02 05:56:24 +00:00
Reed Kotler	63f3312355	Start static relocation implementation for mips16. This checkin makes hello world work. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174264 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-02 04:07:35 +00:00
Shuxin Yang	98b93e5a94	rdar://13126763 Fix a bug in DAGCombine. The symptom is mistakenly optimizing expression "x + xx" into "x 3.0". git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174239 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-02 00:22:03 +00:00
Bill Schmidt	cdc3b74cfb	LLVM enablement for some older PowerPC CPUs git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174230 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-01 22:59:51 +00:00
David Sehr	693c37aa86	Two changes relevant to LEA and x32: 1) allows the use of RIP-relative addressing in 32-bit LEA instructions under x86-64 (ILP32 and LP64) 2) separates the size of address registers in 64-bit LEA instructions from control by ILP32/LP64. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174208 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-01 19:28:09 +00:00
Jyotsna Verma	03b3771c6c	Hexagon: Test case to confirm generation of indexed loads with zero offset. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174196 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-01 16:40:06 +00:00
Tim Northover	7bc8414ee9	Add explicit triples to AArch64 tests Only Linux is supported at the moment, and other platforms quickly fault. As a result these tests would fail on non-Linux hosts. It may be worth making the tests more generic again as more platforms are supported. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174170 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-01 11:40:47 +00:00
Tom Stellard	4bdf9890ed	R600: Fold clamp, neg, abs Patch by: Vincent Lejeune Reviewed-by: Tom Stellard <thomas.stellard@amd.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174099 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-31 22:11:54 +00:00
Lang Hames	2d95e43fd8	When lowering memcpys to loads and stores, make sure we don't promote alignments past the natural stack alignment. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174085 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-31 20:23:43 +00:00
Tim Northover	72062f5744	Add AArch64 as an experimental target. This patch adds support for AArch64 (ARM's 64-bit architecture) to LLVM in the "experimental" category. Currently, it won't be built unless requested explicitly. This initial commit should have support for: + Assembly of all scalar (i.e. non-NEON, non-Crypto) instructions (except the late addition CRC instructions). + CodeGen features required for C++03 and C99. + Compilation for the "small" memory model: code+static data < 4GB. + Absolute and position-independent code. + GNU-style (i.e. "__thread") TLS. + Debugging information. The principal omission, currently, is performance tuning. This patch excludes the NEON support also reviewed due to an outbreak of batshit insanity in our legal department. That will be committed soon bringing the changes to precisely what has been approved. Further reviews would be gratefully received. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174054 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-31 12:12:40 +00:00
Eric Christopher	a9bd4b4647	Check and allow floating point registers to select the size of the register for inline asm. This conforms to how gcc allows for effective casting of inputs into gprs (fprs is already handled). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174008 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-31 00:50:46 +00:00
Eli Bendersky	2acfb179fc	Replace some more greps with FileChecks in tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174006 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-31 00:44:12 +00:00
Eli Bendersky	ee1841cdda	Rewrite this test properly with a FileCheck instead of greps git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173997 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-31 00:11:52 +00:00
Hal Finkel	9a79b320cb	PPC QPX requires a 32-byte aligned stack On systems which support the QPX vector instructions, the stack must be 32-byte aligned. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173993 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-30 23:43:27 +00:00
Evan Cheng	b25a645830	Forgot the test case before. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173988 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-30 22:57:00 +00:00
Hal Finkel	5bb16fdbb3	Add definitions for the PPC a2q core marked as having QPX available This is the first commit of a large series which will add support for the QPX vector instruction set to the PowerPC backend. This instruction set is used on the IBM Blue Gene/Q supercomputers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173973 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-30 21:17:42 +00:00
Eli Bendersky	0f156af831	Add a special ARM trap encoding for NaCl. More details in this thread: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130128/163783.html Patch by JF Bastien git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173943 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-30 16:30:19 +00:00
Logan Chien	620d5bd8e4	Add missing header and test cases for r173939. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173941 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-30 15:48:50 +00:00
Akira Hatanaka	25832e2aa9	[mips] Test case for r173862. Patch by Sasa Stankovic. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173863 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-30 00:28:15 +00:00
Tim Northover	0adfdedacb	Fix 64-bit atomic operations in Thumb mode. The ARM and Thumb variants of LDREXD and STREXD have different constraints and take different operands. Previously the code expanding atomic operations didn't take this into account and asserted in Thumb mode. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173780 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-29 09:06:13 +00:00
Bill Schmidt	5ff776bfde	This patch addresses bug 15031. The common code in the post-RA scheduler to break anti-dependencies on the critical path contained a flaw. In the reported case, an anti-dependency between the overlapping registers %X4 and %R4 exists: %X29<def> = OR8 %X4, %X4 %R4<def>, %X3<def,dead,tied3> = LBZU 1, %X3<kill,tied1> The unpatched code breaks the dependency by replacing %R4 and its uses with %R3, the first register on the available list. However, %R3 and %X3 overlap, so this creates two overlapping definitions on the same instruction. The fix is straightforward, preventing selection of a register that overlaps any other defined register on the same instruction. The test case is reduced from the bug report, and verifies that we no longer produce "lbzu 3, 1(3)" when breaking this anti-dependency. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173706 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-28 18:36:58 +00:00
Benjamin Kramer	914f8c4825	When the legalizer is splitting vector shifts, the result may not have the right shift amount type. Fix that by adding a cast to the shift expander. This came up with vector shifts on sse-less X86 CPUs. <2 x i64> = shl <2 x i64> <2 x i64> -> i64,i64 = shl i64 i64; shl i64 i64 -> i32,i32,i32,i32 = shl_parts i32 i32 i64; shl_parts i32 i32 i64 Now we cast the last two i64s to the right type. Fixes the crash in PR14668. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173615 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-27 11:19:11 +00:00
Benjamin Kramer	11f2bf7f15	X86: Do splat promotion later, so the optimizer can chew on it first. This catches many cases where we can emit a more efficient shuffle for a specific mask or when the mask contains undefs. Once the splat is lowered to unpacks we can't do that anymore. There is a possibility of moving the promotion after pshufb matching, but I'm not sure if pshufb with a mask loaded from memory is faster than 3 shuffles, so I avoided that for now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173569 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-26 11:44:21 +00:00
Benjamin Kramer	6bbc1421ce	FileCheckize and merge some tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173568 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-26 11:14:32 +00:00
Reid Kleckner	ce98f09f53	FileCheck-ify some grep tests These tests in particular try to use escaped square brackets as an argument to grep, which is failing for me with native win32 python. It appears the backslash is being lost near the CreateProcess*() call. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173506 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-25 22:11:46 +00:00
Eli Bendersky	a5597f0eaf	In this patch, we teach X86_64TargetMachine that it has a ILP32 (defined by the x32 ABI) mode, in which case its pointers are 32-bits in size. This knowledge is also added to X86RegisterInfo that now returns the appropriate registers in getPointerRegClass. There are many outcomes to this change. In order to keep the patches separate and manageable, we start by focusing on some simple testable cases. The patch adds a test with passing a pointer to a function - focusing on the difference between the two data models for x86-64. Another test is added for handling of 'sret' arguments (and functionality is added in X86ISelLowering to make it work). A note on naming: the "x32 ABI" document refers to the AMD64 architecture (in LLVM it's distinguished by being is64Bits() in the x86 subtarget) with two variations: the LP64 (default) data model, and the ILP32 data model. This patch adds predicates to the subtarget which are consistent with this naming scheme. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173503 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-25 22:07:43 +00:00
Eli Bendersky	767295f114	Now that llvm-dwarfdump supports flags to specify which DWARF section to dump, use them in tests that run llvm-dwarfdump. This is in order to make tests as specific as possible. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173498 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-25 21:44:53 +00:00
Silviu Baranga	4a9256f265	Fixed the condition codes for the atomic64 min/umin code generation on ARM. If the sutraction of the higher 32 bit parts gives a 0 result, we need to do the store operation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173437 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-25 10:39:49 +00:00
Andrew Trick	4e1fb18940	MIsched: Improve the interface to SchedDFS analysis (subtrees). Allow the strategy to select SchedDFS. Allow the results of SchedDFS to affect initialization of the scheduler state. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173425 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-25 06:33:57 +00:00
Andrew Trick	178f7d08a4	MISched: Add SchedDFSResult to ScheduleDAGMI to formalize the interface and allow other strategies to select it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173413 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-25 04:01:04 +00:00
Akira Hatanaka	d2047c6001	[mips] Set flag neverHasSideEffects flag on some of the floating point instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173401 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-25 00:20:39 +00:00
Reed Kotler	8453b3f66a	The next phase of Mips16 hard float implementation. Allow Mips16 routines to call Mips32 routines that have abi requirements that either arguments or return values are passed in floating point registers. This handles only the pic case. We have not done non pic for Mips16 yet in any form. The libm functions are Mips32, so with this addition we have a complete Mips16 hard float implementation. We still are not able to complete mix Mip16 and Mips32 with hard float. That will be the next phase which will have several steps. For Mips32 to freely call Mips16 some stub functions must be created. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173320 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-24 04:24:02 +00:00
Bill Wendling	e4957fb9b7	Add the heuristic to differentiate SSPStrong from SSPRequired. The requirements of the strong heuristic are: * A Protector is required for functions which contain an array, regardless of type or length. * A Protector is required for functions which contain a structure/union which contains an array, regardless of type or length. Note, there is no limit to the depth of nesting. * A protector is required when the address of a local variable (i.e., stack based variable) is exposed. (E.g., such as through a local whose address is taken as part of the RHS of an assignment or a local whose address is taken as part of a function argument.) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173231 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-23 06:43:53 +00:00
Bill Wendling	114baee1fa	Add the IR attribute 'sspstrong'. SSPStrong applies a heuristic to insert stack protectors in these situations: * A Protector is required for functions which contain an array, regardless of type or length. * A Protector is required for functions which contain a structure/union which contains an array, regardless of type or length. Note, there is no limit to the depth of nesting. * A protector is required when the address of a local variable (i.e., stack based variable) is exposed. (E.g., such as through a local whose address is taken as part of the RHS of an assignment or a local whose address is taken as part of a function argument.) This patch implements the SSPString attribute to be equivalent to SSPRequired. This will change in a subsequent patch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173230 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-23 06:41:41 +00:00
Michael Liao	13d08bf415	Fix an issue of pseudo atomic instruction DAG schedule - Add list of physical registers clobbered in pseudo atomic insts Physical registers are clobbered when pseudo atomic instructions are expanded. Add them in clobber list to prevent DAG scheduler to mis-schedule them after these insns are declared side-effect free. - Add test case from Michael Kuperstein <michael.m.kuperstein@intel.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173200 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-22 21:47:38 +00:00
Akira Hatanaka	a88322c283	[mips] Implement MipsRegisterInfo::getRegPressureLimit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173197 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-22 21:34:25 +00:00
NAKAMURA Takumi	1340833d7c	llvm/test/CodeGen/X86/win_ftol2.ll: Add -cpu=generic to appease valgrind. On valgrind the processor is reported; Host CPU: athlon-fx git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172983 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-20 15:40:02 +00:00
Nadav Rotem	0c8607ba6a	Revert 172708. The optimization handles esoteric cases but adds a lot of complexity both to the X86 backend and to other backends. This optimization disables an important canonicalization of chains of SEXT nodes and makes SEXT and ZEXT asymmetrical. Disabling the canonicalization of consecutive SEXT nodes into a single node disables other DAG optimizations that assume that there is only one SEXT node. The AVX mask optimizations is one example. Additionally this optimization does not update the cost model. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172968 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-20 08:35:56 +00:00
Nadav Rotem	ba95865441	On Sandybridge split unaligned 256bit stores into two xmm-sized stores. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172894 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-19 08:38:41 +00:00
Jakob Stoklund Olesen	d32eea9636	Remove some register allocation order dependencies. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172874 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-19 00:03:32 +00:00
Nadav Rotem	48177ac90f	On Sandybridge loading unaligned 256bits using two XMM loads (vmovups and vinsertf128) is faster than using a single vmovups instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172868 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-18 23:10:30 +00:00
NAKAMURA Takumi	ca81374e32	llvm/test/CodeGen/X86/Atomics-64.ll: Tweak for 2nd RUN not to overwrite %t. It sometimes causes spurious failure on lit win32. Feel free to prune or suppress each output. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172823 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-18 14:52:02 +00:00
Bill Schmidt	d69a43a88d	Restore reverted test case, this time with REQUIRES: asserts git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172747 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-17 19:46:51 +00:00
Bill Schmidt	c087dcfa63	Remove bad test case git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172746 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-17 19:39:36 +00:00
Bill Schmidt	8f4ee4b2a2	This patch fixes PR13626 by providing i128 support in the return calling convention. 128-bit integers are now properly returned in GPR3 and GPR4 on PowerPC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172745 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-17 19:34:57 +00:00
Jyotsna Verma	a454ffd02a	Add indexed load/store instructions for offset validation check. This patch fixes bug 14902 - http://llvm.org/bugs/show_bug.cgi?id=14902 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172737 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-17 18:42:37 +00:00
Bill Schmidt	792b123338	This patch fixes the PPC calling convention to handle returns of _Complex float and _Complex long double, by simply increasing the number of floating point registers available for return values. The test case verifies that the correct registers are loaded. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172733 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-17 17:45:19 +00:00
Elena Demikhovsky	6c327f92a5	Optimization for the following SIGN_EXTEND pairs: v8i8 -> v8i64, v8i8 -> v8i32, v4i8 -> v4i64, v4i16 -> v4i64 for AVX and AVX2. Bug 14865. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172708 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-17 09:59:53 +00:00
Bill Schmidt	89e88e30bf	This patch addresses an incorrect transformation in the DAG combiner. The included test case is derived from one of the GCC compatibility tests. The problem arises after the selection DAG has been converted to type-legalized form. The combiner first sees a 64-bit load that can be converted into a pre-increment form. The original load feeds into a SRL that isolates the upper 32 bits of the loaded doubleword. This looks like an opportunity for DAGCombiner::ReduceLoadWidth() to replace the 64-bit load with a 32-bit load. However, this transformation is not valid, as the replacement load is not a pre-increment load. The pre-increment load produces an extra result, which feeds a subsequent add instruction. The replacement load only has one result value, and this value is propagated to all uses of the pre- increment load, including the add. Because the add is looking for the second result value as its operand, it ends up attempting to add a constant to a token chain, resulting in a crash. So the patch simply disables this transformation for any load with more than two result values. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172480 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-14 22:04:38 +00:00
Benjamin Kramer	08219ea2b4	X86: Add patterns for X86ISD::VSEXT in registers. Those can occur when something between the sextload and the store is on the same chain and blocks isel. Fixes PR14887. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172353 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-13 11:37:04 +00:00
Benjamin Kramer	4dc478308f	When lowering an inreg sext first shift left, then right arithmetically. Shifting right two times will only yield zero. Should fix SingleSource/UnitTests/SignlessTypes/factor. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172322 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-12 19:06:44 +00:00
Nadav Rotem	66de2af815	PPC: Implement efficient lowering of sign_extend_inreg. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172269 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-11 22:57:48 +00:00
Preston Gurd	1452d46e0b	Update patch for the pad short functions pass for Intel Atom (only). Adds a check for -Oz, changes the code to not re-visit BBs, and skips over DBG_VALUE instrs. Patch by Andy Zhang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172258 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-11 22:06:56 +00:00
Eric Christopher	fffe363493	For inline asm: - recognize string "{memory}" in the MI generation - mark as mayload/maystore when there's a memory clobber constraint. PR14859. Patch by Krzysztof Parzyszek git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172228 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-11 18:12:39 +00:00
Tim Northover	5f2801bd65	Simplify writing floating types to assembly. This removes previous special cases for each floating-point type in favour of a shared codepath. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172189 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-11 10:36:13 +00:00
NAKAMURA Takumi	0e4776ce61	llvm/test/CodeGen/X86/ms-inline-asm.ll: Fixup; Globals doesn't have leading underscore in symbol on linux. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172139 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-10 23:02:48 +00:00
Evan Cheng	4ff23d09fa	PR14896: Handle memcpy from constant string where the memcpy size is larger than the string size. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172124 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-10 22:13:27 +00:00
Chad Rosier	c1ec207b61	[ms-inline asm] Add support for calling functions from inline assembly. Part of rdar://12991541 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172121 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-10 22:10:27 +00:00
Manman Ren	86441169da	Stack Alignment: throw error if we can't satisfy the minimal alignment requirement when creating stack objects in MachineFrameInfo. Add CreateStackObjectWithMinAlign to throw error when the minimal alignment can't be achieved and to clamp the alignment when the preferred alignment can't be achieved. Same is true for CreateVariableSizedObject. Will not emit error in CreateSpillStackObject or CreateStackObject. As long as callers of CreateStackObject do not assume the object will be aligned at the requested alignment, we should not have miscompile since later optimizations which look at the object's alignment will have the correct information. rdar://12713765 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@172027 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-10 01:10:10 +00:00
Evan Cheng	78ec0255d9	Fix a DAG combine bug visitBRCOND() is transforming br(xor(x, y)) to br(x != y). It cahced XOR's operands before calling visitXOR() but failed to update the operands when visitXOR changed the XOR node. rdar://12968664 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171999 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-09 20:56:40 +00:00
Nadav Rotem	1977675a50	add -march to the test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171956 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-09 07:04:23 +00:00
Nadav Rotem	13f8cf55d4	Efficient lowering of vector sdiv when the divisor is a splatted power of two constant. PR 14848. The lowered sequence is based on the existing sequence the target-independent DAG Combiner creates for the scalar case. Patch by Zvi Rackover. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171953 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-09 05:14:33 +00:00
Andrew Trick	47579cf390	MIsched: add an ILP window property to machine model. This was an experimental option, but needs to be defined per-target. e.g. PPC A2 needs to aggressively hide latency. I converted some in-order scheduling tests to A2. Hal is working on more test cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171946 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-09 03:36:49 +00:00
Tim Northover	a67352d401	Specify complete triple for fp128 tests. This avoids FileCheck failing over different comment characters in assembly (notably powerpc64 on Linux vs Darwin) and should fix David's build-bot. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171886 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-08 19:36:33 +00:00
Preston Gurd	c7b902e7fe	Pad Short Functions for Intel Atom The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. This patch has been updated to address Nadav's review comments - Optimize only at >= O1 and don't do optimization if -Os is set - Stores MachineBasicBlock* instead of BBNum - Uses DenseMap instead of std::map - Fixes placement of braces Patch by Andy Zhang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171879 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-08 18:27:24 +00:00
Tim Northover	90f011e0ba	Allow the asm printer to print fp128 values properly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171866 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-08 16:56:23 +00:00
Bill Schmidt	5b7f9216c3	This patch addresses bug 14678 by fixing two problems in medium code model code generation. Variables addressed through a GlobalAlias were not being handled, and variables with available_externally linkage were treated incorrectly. The patch contains two new tests to verify the correct code generation for these cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171778 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-07 19:29:18 +00:00
Silviu Baranga	e97165901e	Make the MergeGlobals pass correctly handle the address space qualifiers of the global variables. We partition the set of globals by their address space, and apply the same the trasnformation as before to merge them. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171730 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-07 12:31:25 +00:00
Craig Topper	f564a9389d	Fix suffix handling for parsing and printing of cvtsi2ss, cvtsi2sd, cvtss2si, cvttss2si, cvtsd2si, and cvttsd2si to match gas behavior. cvtsi2* should parse with an 'l' or 'q' suffix or no suffix at all. No suffix should be treated the same as 'l' suffix. Printing should always print a suffix. Previously we didn't parse or print an 'l' suffix. cvtt2si/cvt2si should parse with an 'l' or 'q' suffix or not suffix at all. No suffix should use the destination register size to choose encoding. Printing should not print a suffix. Original 'l' suffix issue with cvtsi2* pointed out by Michael Kuperstein. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171668 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-06 20:39:29 +00:00
Evan Cheng	700843ec2c	Fix for PR14739. It's not safe to fold a load into a call across a store. Thanks to Nick Lewycky for the initial patch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171665 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-06 19:00:15 +00:00
Craig Topper	835e7bc48e	Recommit r171461 which was incorrectly reverted. Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171608 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-05 07:39:25 +00:00
Nadav Rotem	5d1f5c1737	Revert revision 171524. Original message: URL: http://llvm.org/viewvc/llvm-project?rev=171524&view=rev Log: The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. Patch by Andy Zhang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171603 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-05 05:42:48 +00:00
Preston Gurd	dd30b47175	The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. Patch by Andy Zhang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171524 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-04 20:54:54 +00:00
Akira Hatanaka	e13f441e00	[mips] MipsTargetLowering::getSetCCResultType should return a vector type if vectors are being compared. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171517 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-04 20:06:01 +00:00
Nadav Rotem	e12bf18754	Revert revision: 171467. This transformation is incorrect and makes some tests fail. Original message: Simplified TRUNCATE operation that comes after SETCC. It is possible since SETCC result is 0 or -1. Added a test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171468 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-04 17:35:21 +00:00
Elena Demikhovsky	ab70320908	Simplified TRUNCATE operation that comes after SETCC. It is possible since SETCC result is 0 or -1. Added a test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171467 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-03 08:48:33 +00:00
Michael Gottesman	e33a8b8c2f	Revert "Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks." This reverts commit r171461 since it breaks the following tests: Clang :: Analysis/outofbound-notwork.c Clang :: Analysis/string-fail.c Clang :: CXX/basic/basic.lookup/basic.lookup.qual/p6-0x.cpp Clang :: CXX/basic/basic.lookup/basic.lookup.unqual/p15.cpp Clang :: CXX/dcl.dcl/dcl.spec/dcl.fct.spec/p4.cpp Clang :: CXX/dcl.dcl/dcl.spec/dcl.stc/p10.cpp Clang :: CXX/temp/temp.param/p14.cpp Clang :: CXX/temp/temp.res/temp.dep.res/temp.point/p1.cpp Clang :: CodeGen/2009-02-13-zerosize-union-field-ppc.c Clang :: CodeGen/blocks-2.c Clang :: CodeGen/libcalls-d.c Clang :: CodeGen/libcalls-ld.c Clang :: CodeGenCXX/conversion-function.cpp Clang :: CodeGenCXX/debug-info-limit-type.cpp Clang :: CodeGenCXX/inheriting-constructor.cpp Clang :: FixIt/fixit-errors.c Clang :: FixIt/fixit-pmem.cpp Clang :: Modules/namespaces.cpp Clang :: PCH/changed-files.c Clang :: PCH/pr4489.c Clang :: PCH/source-manager-stack.c Clang :: Parser/cxx-ambig-decl-expr-xfail.cpp Clang :: SemaCXX/switch-implicit-fallthrough-cxx98.cpp Clang :: SemaTemplate/instantiate-function-1.mm git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171466 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-03 08:18:30 +00:00
Craig Topper	56bc0ab095	Mark DIV/IDIV instructions hasSideEffects=1 because they can trap when dividing by 0. This is needed to keep early if conversion from moving them across basic blocks. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171461 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-03 06:40:20 +00:00
Jakob Stoklund Olesen	251ed7f3e5	Fix PR14732 by handling all kinds of IMPLICIT_DEF live ranges. Most IMPLICIT_DEF instructions are removed by the ProcessImplicitDefs pass, and a few are reinserted by PHIElimination when a PHI argument is <undef>. RegisterCoalescer was assuming that all IMPLICIT_DEF live ranges look like those created by PHIElimination, and that their live range never leaves the basic block. The PR14732 test case does tricks with PHI nodes that causes a longer IMPLICIT_DEF live range to appear. This happens very rarely, but RegisterCoalescer should be able to handle it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171435 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-03 00:47:51 +00:00
Tom Stellard	d40758b24e	DAGCombiner: Avoid generating illegal vector INT_TO_FP nodes DAGCombiner::reduceBuildVecConvertToConvertBuildVec() was making two mistakes: 1. It was checking the legality of scalar INT_TO_FP nodes and then generating vector nodes. 2. It was passing the result value type to TargetLoweringInfo::getOperationAction() when it should have been passing the value type of the first operand. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171420 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-02 22:13:01 +00:00
Nadav Rotem	d570f59048	AVX: Fix a bug in WidenMaskArithmetic. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171397 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-02 17:40:39 +00:00
Hal Finkel	6eb7a4270b	Support ppcf128 in SelectionDAG::getConstantFP Fixes pr14751. Patch by Kai; Thanks! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171261 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-30 19:03:32 +00:00
Dmitri Gribenko	a6542923b8	Tests: rewrite 'opt ... %s' to 'opt ... < %s' so that opt does not emit a ModuleID This is done to avoid odd test failures, like the one fixed in r171243. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171250 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-30 02:33:22 +00:00
Nadav Rotem	0509db2738	AVX: Move the ZEXT/ANYEXT DAGCo optimizations to the lowering of these optimizations. The old test cases still cover all of these lowering/optimizations. The single change that we have is that now anyext does not need to zero a register, because it does not use the exact code path as the zero_extend. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171178 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-28 05:45:24 +00:00
Nadav Rotem	d6fb53adb1	On AVX/AVX2 the type v8i1 is legalized to v8i16, which is an XMM sized register. In most cases we actually compare or select YMM-sized registers and mixing the two types creates horrible code. This commit optimizes some of the transition sequences. PR14657. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171148 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-27 08:15:45 +00:00
NAKAMURA Takumi	b1a3bafbf1	llvm/test/CodeGen/X86: FileCheck-ize two tests in r171083. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171084 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-26 03:19:30 +00:00
NAKAMURA Takumi	05c8fd9ee2	llvm/test/CodeGen/X86: Disable avx in two tests corresponding to r171082. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171083 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-26 03:08:55 +00:00
Hal Finkel	abdf75511b	Loosen scheduling restrictions on the PPC dcbt intrinsic As with the prefetch intrinsic to which it maps, simply have dcbt marked as reading from and writing to its arguments instead of having unmodeled side effects. While this might cause unwanted code motion (because aliasing checks don't really capture cache-line sharing), it is more important that prefetches in unrolled loops don't block the scheduler from rearranging the unrolled loop body. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171073 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-25 18:51:18 +00:00
Hal Finkel	cd9ea51986	Expand PPC64 atomic load and store Use of store or load with the atomic specifier on 64-bit types would cause instruction-selection failures. As with the 32-bit case, these can use the default expansion in terms of cmp-and-swap. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171072 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-25 17:22:53 +00:00
Benjamin Kramer	50ec431c9f	Harden test so it's not affected by changes to compare lowering. This only failed on hosts that don't have SSE41. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171066 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-25 13:23:23 +00:00
Benjamin Kramer	99f78061e0	X86: Shave off one shuffle from the pcmpeqq sequence for SSE2 by making use of and commutativity. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171064 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-25 13:09:08 +00:00
Benjamin Kramer	382ed78d3f	X86: Custom lower <2 x i64> eq and ne when SSE41 is not available. pcmpeqd, pshufd, pshufd, pand is cheaper than unpack + cmpq, sbbq, cmpq, sbbq + pack. Small speedup on loop-vectorized viterbi (-march=core2). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171063 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-25 12:54:19 +00:00
NAKAMURA Takumi	34cb54bea8	llvm/test/CodeGen/X86/fold-vex.ll: Add explicit triple. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171029 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-24 11:14:06 +00:00
Nadav Rotem	ace0c2fad7	Some x86 instructions can load/store one of the operands to memory. On SSE, this memory needs to be aligned. When these instructions are encoded in VEX (on AVX) there is no such requirement. This changes the folding tables and removes the alignment restrictions from VEX-encoded instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171024 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-24 09:40:33 +00:00
Benjamin Kramer	2f8a6cdfa3	X86: Turn mul of <4 x i32> into pmuludq when no SSE4.1 is available. pmuludq is slow, but it turns out that all the unpacking and packing of the scalarized mul is even slower. 10% speedup on loop-vectorized paq8p. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170985 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-22 16:07:56 +00:00
Benjamin Kramer	17347912b4	X86: Emit vector sext as shuffle + sra if vpmovsx is not available. Also loosen the SSSE3 dependency a bit, expanded pshufb + psra is still better than scalarized loads. Fixes PR14590. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170984 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-22 11:34:28 +00:00
Nadav Rotem	d0696ef8c3	In some cases, due to scheduling constraints we copy the EFLAGS. The only way to read the eflags is using push and pop. If we don't adjust the stack then we run over the first frame index. This is not something that we want to do, so we have to make sure that our machine function does not copy the flags. If it does then we have to emit the prolog that adjusts the stack. rdar://12896831 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170961 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-21 23:48:49 +00:00
Benjamin Kramer	4716cf4981	try to unbreak ppc buildbots. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170913 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-21 18:11:45 +00:00
Benjamin Kramer	2556c6b4b6	X86: Match pmin/pmax as a target specific dag combine. This occurs during vectorization. Part of PR14667. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170908 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-21 17:46:58 +00:00
Tom Stellard	519b456fe1	R600: Expand vec4 INT <-> FP conversions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170901 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-21 16:33:24 +00:00
Reed Kotler	e30843ded9	Add test case for r170674 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170823 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-21 00:55:10 +00:00
Eric Christopher	71a9c2137b	Move these files over to the debug info directory. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170810 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-21 00:03:42 +00:00
Bob Wilson	103b4a571e	Revert "Adding support for llvm.arm.neon.vaddl[su].* and" This reverts r170694. The operations can be represented in IR without adding any new intrinsics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170765 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-20 21:09:38 +00:00
Evan Cheng	139e407d52	On some ARM cpus, flags setting movs with shifter operand, i.e. lsl, lsr, asr, are more expensive than the non-flag setting variant. Teach thumb2 size reduction pass to avoid generating them unless we are optimizing for size. rdar://12892707 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170728 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-20 19:59:30 +00:00
Rafael Espindola	ffc7d3b0ad	Simplify the testcase a bit. I checked that it would still crash llc before the corresponding fix. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170709 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-20 17:47:27 +00:00
Renato Golin	332bd79951	Adding support for llvm.arm.neon.vaddl[su].* and llvm.arm.neon.vsub[su].* intrinsics. Patch by Pete Couperus <pjcoup@gmail.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170694 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-20 13:52:11 +00:00
Reed Kotler	cef95f702a	fix most of remaining issues with large frames. these patches are tested a lot by test-suite but make check tests are forthcoming once the next few patches that complete this are committed. with the next few patches the pass rate for mips16 is near 100% git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170656 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-20 04:07:42 +00:00
Akira Hatanaka	68fe665b9a	[mips] Use "or $r0, $r1, $zero" instead of "addu $r0, $zero, $r1" to copy physical register $r1 to $r0. GNU disassembler recognizes an "or" instruction as a "move", and this change makes the disassembled code easier to read. Original patch by Reed Kotler. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170655 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-20 04:06:06 +00:00
Bob Wilson	99d8e76d44	Do not introduce vector operations in functions marked with noimplicitfloat. <rdar://problem/12879313> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170630 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-20 01:36:20 +00:00
Evan Cheng	733c6b1db1	LLVM sdisel normalize bit extraction of the form: ((x & 0xff00) >> 8) << 2 to (x >> 6) & 0x3fc This is general goodness since it folds a left shift into the mask. However, the trailing zeros in the mask prevents the ARM backend from using the bit extraction instructions. And worse since the mask materialization may require an addition instruction. This comes up fairly frequently when the result of the bit twiddling is used as memory address. e.g. = ptr[(x & 0xFF0000) >> 16] We want to generate: ubfx r3, r1, #16, #8 ldr.w r3, [r0, r3, lsl #2] vs. mov.w r9, #1020 and.w r2, r9, r1, lsr #14 ldr r2, [r0, r2] Add a late ARM specific isel optimization to ARMDAGToDAGISel::PreprocessISelDAG(). It folds the left shift to the 'base + offset' address computation; change the mask to one which doesn't have trailing zeros and enable the use of ubfx. Note the optimization has to be done late since it's target specific and we don't want to change the DAG normalization. It's also fairly restrictive as shifter operands are not always free. It's only done for lsh 1 / 2. It's known to be free on some cpus and they are most common for address computation. This is a slight win for blowfish, rijndael, etc. rdar://12870177 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170581 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-19 20:16:09 +00:00
Benjamin Kramer	91223a41ef	PowerPC: Expand VSELECT nodes. There's probably a better expansion for those nodes than the default for altivec, but this is better than crashing. VSELECTs occur in loop vectorizer output. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170551 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-19 15:49:14 +00:00
Elena Demikhovsky	4b977312c7	Optimized load + SIGN_EXTEND patterns in the X86 backend. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170506 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-19 07:50:20 +00:00
Nadav Rotem	bf5a2c6a39	After reducing the size of an operation in the DAG we zero-extend the reduced bitwidth op back to the original size. If we reduce ANDs then this can cause an endless loop. This patch changes the ZEXT to ANY_EXTEND if the demanded bits are equal or smaller than the size of the reduced operation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170505 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-19 07:39:08 +00:00
Craig Topper	40b4a81ab0	Teach SimplifySetCC that comparing AssertZext i1 against a constant 1 can be rewritten as a compare against a constant 0 with the opposite condition. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170495 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-19 06:12:28 +00:00
Quentin Colombet	b519351b87	Disable ARM partial flag dependency optimization at -Oz To not over constrain the scheduler for ARM in thumb mode, some optimizations for code size reduction, specific to ARM thumb, are blocked when they add a dependency (like write after read dependency). Disables this check when code size is the priority, i.e., code is compiled with -Oz. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170462 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-18 22:47:16 +00:00
Andrew Trick	04f52e1300	MISched: add dependence to ExitSU to model live-out latency. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170454 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-18 20:53:01 +00:00
Hal Finkel	ca2dd36c39	Check multiple register classes for inline asm tied registers A register can be associated with several distinct register classes. For example, on PPC, the floating point registers are each associated with both F4RC (which holds f32) and F8RC (which holds f64). As a result, this code would fail when provided with a floating point register and an f64 operand because it would happen to find the register in the F4RC class first and return that. From the F4RC class, SDAG would extract f32 as the register type and then assert because of the invalid implied conversion between the f64 value and the f32 register. Instead, search all register classes. If a register class containing the the requested register has the requested type, then return that register class. Otherwise, as before, return the first register class found that contains the requested register. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170436 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-18 17:50:58 +00:00
Craig Topper	b72ae70036	Add rest of BMI/BMI2 instructions to the folding tables as well as popcnt and lzcnt. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170304 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-17 05:02:29 +00:00
Reed Kotler	2c3a4641a7	This patch is needed to make c++ exceptions work for mips16. Mips16 is really a processor decoding mode (ala thumb 1) and in the same program, mips16 and mips32 functions can exist and can call each other. If a jal type instruction encounters an address with the lower bit set, then the processor switches to mips16 mode (if it is not already in it). If the lower bit is not set, then it switches to mips32 mode. The linker knows which functions are mips16 and which are mips32. When relocation is performed on code labels, this lower order bit is set if the code label is a mips16 code label. In general this works just fine, however when creating exception handling tables and dwarf, there are cases where you don't want this lower order bit added in. This has been traditionally distinguished in gas assembly source by using a different syntax for the label. lab1: ; this will cause the lower order bit to be added lab2=. ; this will not cause the lower order bit to be added In some cases, it does not matter because in dwarf and debug tables the difference of two labels is used and in that case the lower order bits subtract each other out. To fix this, I have added to mcstreamer the notion of a debuglabel. The default is for label and debug label to be the same. So calling EmitLabel and EmitDebugLabel produce the same result. For various reasons, there is only one set of labels that needs to be modified for the mips exceptions to work. These are the "$eh_func_beginXXX" labels. Mips overrides the debug label suffix from ":" to "=." . This initial patch fixes exceptions. More changes most likely will be needed to DwarfCFException to make all of this work for actual debugging. These changes will be to emit debug labels in some places where a simple label is emitted now. Some historical discussion on this from gcc can be found at: http://gcc.gnu.org/ml/gcc-patches/2008-08/msg00623.html http://gcc.gnu.org/ml/gcc-patches/2008-11/msg01273.html git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170279 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-16 04:00:45 +00:00
Benjamin Kramer	388fc6a988	X86: Add a couple of target-specific dag combines that turn VSELECTS into psubus if possible. We match the pattern "x >= y ? x-y : 0" into "subus x, y" and two special cases if y is a constant. DAGCombiner canonicalizes those so we first have to undo the canonicalization for those cases. The pattern occurs in gzip when the loop vectorizer is enabled. Part of PR14613. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170273 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-15 16:47:44 +00:00
Reed Kotler	ed23fa8e55	This code implements most of mips16 hardfloat as it is done by gcc. In this case, essentially it is soft float with different library routines. The next step will be to make this fully interoperational with mips32 floating point and that requires creating stubs for functions with signatures that contain floating point types. I have a more sophisticated design for mips16 hardfloat which I hope to implement at a later time that directly does floating point without the need for function calls. The mips16 encoding has no floating point instructions so one needs to switch to mips32 mode to execute floating point instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170259 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-15 00:20:05 +00:00
Nadav Rotem	0a1e914f8f	TypeLegalizer: Do not generate target specific nodes with illegal types, because we cant type-legalize them. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170245 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-14 21:20:37 +00:00
Bill Schmidt	d3eb4f46f0	This patch removes some nondeterminism from direct object file output for TLS dynamic models on 64-bit PowerPC ELF. The default sort routine for relocations only sorts on the r_offset field; but with TLS, there can be two relocations with the same r_offset. For PowerPC, this patch sorts secondarily on descending r_type, which matches the behavior expected by the linker. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170237 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-14 20:28:38 +00:00
Bill Schmidt	b453e16855	This patch improves the 64-bit PowerPC InitialExec TLS support by providing for a wider range of GOT entries that can hold thread-relative offsets. This matches the behavior of GCC, which was not documented in the PPC64 TLS ABI. The ABI will be updated with the new code sequence. Former sequence: ld 9,x@got@tprel(2) add 9,9,x@tls New sequence: addis 9,2,x@got@tprel@ha ld 9,x@got@tprel@l(9) add 9,9,x@tls Note that a linker optimization exists to transform the new sequence into the shorter sequence when appropriate, by replacing the addis with a nop and modifying the base register and relocation type of the ld. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170209 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-14 17:02:38 +00:00
Akira Hatanaka	ed185daba7	[mips] Do not copy GOT address to register $gp if the function being called has internal linkage. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170092 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-13 03:17:29 +00:00
Evan Cheng	9a65a01eeb	Fix a bug in DAGCombiner::MatchBSwapHWord. Make sure the node has operands before referencing them. rdar://12868039 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170078 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-13 01:34:32 +00:00
Evan Cheng	a16e49d56f	Fix a logic bug in inline expansion of memcpy / memset with an overlapping load / store pair. It's not legal to use a wider load than the size of the remaining bytes if it's the first pair of load / store. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170018 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-12 20:43:23 +00:00
Bill Schmidt	71fe60ef10	The ordering of two relocations on the same instruction is apparently not predictable when compiled on at least one non-PowerPC host. Source of nondeterminism not apparent. Restrict the test to build on PowerPC hosts for now while looking into the issue further. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170016 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-12 20:29:20 +00:00
Bill Schmidt	349c2787cf	This patch implements local-dynamic TLS model support for the 64-bit PowerPC target. This is the last of the four models, so we now have full TLS support. This is mostly a straightforward extension of the general dynamic model. I had to use an additional Chain operand to tie ADDIS_DTPREL_HA to the register copy following ADDI_TLSLD_L; otherwise everything above the ADDIS_DTPREL_HA appeared dead and was removed. As before, there are new test cases to test the assembly generation, and the relocations output during integrated assembly. The expected code gen sequence can be read in test/CodeGen/PowerPC/tls-ld.ll. There are a couple of things I think can be done more efficiently in the overall TLS code, so there will likely be a clean-up patch forthcoming; but for now I want to be sure the functionality is in place. Bill git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170003 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-12 19:29:35 +00:00
NAKAMURA Takumi	bd85f1004d	llvm/test/CodeGen/X86/atom-bypass-slow-division.ll: Fix possible typo(s) in CHECK-NOT lines. Found by Alexander Zinenko, thanks! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169978 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-12 13:34:20 +00:00
NAKAMURA Takumi	1a7b4a967d	llvm/test/CodeGen/X86/atom-bypass-slow-division.ll: Rename symbols, s/test_/Test/g, not to mismatch "CHECK(-NOT): test". git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169977 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-12 13:34:14 +00:00
NAKAMURA Takumi	2ab2421a4e	llvm/test/CodeGen/X86/store_op_load_fold.ll: Fix typo, s/CHECK_NEXT/CHECK-NEXT/ git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169957 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-12 01:41:01 +00:00
NAKAMURA Takumi	87de1e72cb	llvm/test/CodeGen/X86/store_op_load_fold.ll: Add explicit triple. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169956 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-12 01:40:56 +00:00
Manman Ren	981b96376a	DAGCombine: clamp hi bit in APInt::getBitsSet to avoid assertion rdar://12838504 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169951 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-12 01:13:50 +00:00
Evan Cheng	61f4dfe369	Avoid using lossy load / stores for memcpy / memset expansion. e.g. f64 load / store on non-SSE2 x86 targets. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169944 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-12 00:42:09 +00:00
Tom Stellard	f98f2ce29e	Add R600 backend A new backend supporting AMD GPUs: Radeon HD2XXX - HD7XXX git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169915 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-11 21:25:42 +00:00
Bill Schmidt	57ac1f458a	This patch implements the general dynamic TLS model for 64-bit PowerPC. Given a thread-local symbol x with global-dynamic access, the generated code to obtain x's address is: Instruction Relocation Symbol addis ra,r2,x@got@tlsgd@ha R_PPC64_GOT_TLSGD16_HA x addi r3,ra,x@got@tlsgd@l R_PPC64_GOT_TLSGD16_L x bl __tls_get_addr(x@tlsgd) R_PPC64_TLSGD x R_PPC64_REL24 __tls_get_addr nop <use address in r3> The implementation borrows from the medium code model work for introducing special forms of ADDIS and ADDI into the DAG representation. This is made slightly more complicated by having to introduce a call to the external function __tls_get_addr. Using the full call machinery is overkill and, more importantly, makes it difficult to add a special relocation. So I've introduced another opcode GET_TLS_ADDR to represent the function call, and surrounded it with register copies to set up the parameter and return value. Most of the code is pretty straightforward. I ran into one peculiarity when I introduced a new PPC opcode BL8_NOP_ELF_TLSGD, which is just like BL8_NOP_ELF except that it takes another parameter to represent the symbol ("x" above) that requires a relocation on the call. Something in the TblGen machinery causes BL8_NOP_ELF and BL8_NOP_ELF_TLSGD to be treated identically during the emit phase, so this second operand was never visited to generate relocations. This is the reason for the slightly messy workaround in PPCMCCodeEmitter.cpp:getDirectBrEncoding(). Two new tests are included to demonstrate correct external assembly and correct generation of relocations using the integrated assembler. Comments welcome! Thanks, Bill git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169910 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-11 20:30:11 +00:00
Chad Rosier	1ad9253c9d	Add a triple to this test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169803 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-11 00:51:36 +00:00
Chandler Carruth	1c49fda408	Fix a miscompile in the DAG combiner. Previously, we would incorrectly try to reduce the width of this load, and would end up transforming: (truncate (lshr (sextload i48 <ptr> as i64), 32) to i32) to (truncate (zextload i32 <ptr+4> as i64) to i32) We lost the sext attached to the load while building the narrower i32 load, and replaced it with a zext because lshr always zext's the results. Instead, bail out of this combine when there is a conflict between a sextload and a zext narrowing. The rest of the DAG combiner still optimize the code down to the proper single instruction: movswl 6(...),%eax Which is exactly what we wanted. Previously we read past the end and missed the sign extension: movl 6(...), %eax git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169802 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-11 00:36:57 +00:00
Paul Redmond	0a0990af1c	move X86-specific test This test case uses -mcpu=corei7 so it belongs in CodeGen/X86 Reviewed by: Nadav git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169801 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-11 00:36:43 +00:00
Chad Rosier	425e951734	Fall back to the selection dag isel to select tail calls. This shouldn't affect codegen for -O0 compiles as tail call markers are not emitted in unoptimized compiles. Testing with the external/internal nightly test suite reveals no change in compile time performance. Testing with -O1, -O2 and -O3 with fast-isel enabled did not cause any compile-time or execution-time failures. All tests were performed on my x86 machine. I'll monitor our arm testers to ensure no regressions occur there. In an upcoming clang patch I will be marking the objc_autoreleaseReturnValue and objc_retainAutoreleaseReturnValue as tail calls unconditionally. While it's theoretically true that this is just an optimization, it's an optimization that we very much want to happen even at -O0, or else ARC applications become substantially harder to debug. Part of rdar://12553082 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169796 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-11 00:18:02 +00:00
Evan Cheng	376642ed62	Some enhancements for memcpy / memset inline expansion. 1. Teach it to use overlapping unaligned load / store to copy / set the trailing bytes. e.g. On 86, use two pairs of movups / movaps for 17 - 31 byte copies. 2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g. x86 and ARM. 3. When memcpy from a constant string, do not replace the load with a constant if it's not possible to materialize an integer immediate with a single instruction (required a new target hook: TLI.isIntImmLegal()). 4. Use unaligned load / stores more aggressively if target hooks indicates they are "fast". 5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8. Also increase the threshold to something reasonable (8 for memset, 4 pairs for memcpy). This significantly improves Dhrystone, up to 50% on ARM iOS devices. rdar://12760078 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169791 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-10 23:21:26 +00:00
Hal Finkel	f21831073c	Use GetUnderlyingObjects in misched misched used GetUnderlyingObject in order to break false load/store dependencies, and the -enable-aa-sched-mi feature similarly relied on GetUnderlyingObject in order to ensure it is safe to use the aliasing analysis. Unfortunately, GetUnderlyingObject does not recurse through phi nodes, and so (especially due to LSR) all of these mechanisms failed for induction-variable-dependent loads and stores inside loops. This change replaces uses of GetUnderlyingObject with GetUnderlyingObjects (which will recurse through phi and select instructions) in misched. Andy reviewed, tested and simplified this patch; Thanks! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169744 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-10 18:49:16 +00:00
Craig Topper	48b509c773	Teach DAG combine to handle vector add/sub with vectors of all 0s. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169727 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-10 08:12:29 +00:00
Craig Topper	9472b4fbf9	Teach DAG combine to handle vector logical operations with vectors of all 1s or all 0s. These cases can show up when vectors are split for legalizing. Fix some tests that were dependent on these cases not being combined. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169684 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-08 22:49:19 +00:00
Nadav Rotem	af59e9adbd	When we use the BLEND instruction that uses the MSB as a mask, we can remove the VSRI instruction before it since it does not affect the MSB. Thanks Craig Topper for suggesting this. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169638 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-07 21:43:11 +00:00
Matthew Curtis	ade50dc6c7	In hexagon convertToHardwareLoop, don't deref end() iterator In particular, check if MachineBasicBlock::iterator is end() before using it to call getDebugLoc(); See also this thread on llvm-commits: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20121112/155914.html git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169634 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-07 21:03:15 +00:00
Nadav Rotem	e4ccfef809	X86: Prefer using VPSHUFD over VPERMIL because it has better throughput. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169624 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-07 19:01:13 +00:00
Tim Northover	6eb3e87df0	Added Mapping Symbols for ARM ELF Before this patch, when you objdump an LLVM-compiled file, objdump tried to decode data-in-code sections as if they were code. This patch adds the missing Mapping Symbols, as defined by "ELF for the ARM Architecture" (ARM IHI 0044D). Patch based on work by Greg Fitzgerald. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169609 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-07 16:50:23 +00:00
Dmitri Gribenko	00e97c2126	Fix typos in CHECK lines. Patch by Alexander Zinenko. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169547 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-06 21:24:47 +00:00
Nadav Rotem	dde785cd70	Fix a bug in the code that merges consecutive stores. Previously we did not check if loads that happen in between stores alias with the first store in the chain, only with the second store onwards. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169516 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-06 17:34:13 +00:00
Craig Topper	da92646875	Remove intrinsic specific instructions for (V)MOVQUmr with patterns pointing to the normal instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169482 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-06 07:31:16 +00:00
Evan Cheng	824ec7d01a	Properly fix the tes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169464 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-06 02:29:29 +00:00
NAKAMURA Takumi	e1ab8e3e73	llvm/test/CodeGen/ARM/extload-knownzero.ll: Try to unbreak, to add -O0. I guess Chad expects fastisel here. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169463 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-06 02:22:58 +00:00
Chad Rosier	c9758b1366	[arm fast-isel] Make the fast-isel implementation of memcpy respect alignment. rdar://12821569 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169460 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-06 01:34:31 +00:00
Evan Cheng	8a7186dbc2	Let targets provide hooks that compute known zero and ones for any_extend and extload's. If they are implemented as zero-extend, or implicitly zero-extend, then this can enable more demanded bits optimizations. e.g. define void @foo(i16* %ptr, i32 %a) nounwind { entry: %tmp1 = icmp ult i32 %a, 100 br i1 %tmp1, label %bb1, label %bb2 bb1: %tmp2 = load i16* %ptr, align 2 br label %bb2 bb2: %tmp3 = phi i16 [ 0, %entry ], [ %tmp2, %bb1 ] %cmp = icmp ult i16 %tmp3, 24 br i1 %cmp, label %bb3, label %exit bb3: call void @bar() nounwind br label %exit exit: ret void } This compiles to the followings before: push {lr} mov r2, #0 cmp r1, #99 bhi LBB0_2 @ BB#1: @ %bb1 ldrh r2, [r0] LBB0_2: @ %bb2 uxth r0, r2 cmp r0, #23 bhi LBB0_4 @ BB#3: @ %bb3 bl _bar LBB0_4: @ %exit pop {lr} bx lr The uxth is not needed since ldrh implicitly zero-extend the high bits. With this change it's eliminated. rdar://12771555 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169459 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-06 01:28:01 +00:00
Andrew Trick	f3329c419b	RegisterPressureTracker: fix findUseBetween to handle DebugValue git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169427 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-05 21:37:50 +00:00
Andrew Trick	553c42cefc	RegisterPresssureTracker: Track live physical register by unit. This is much simpler to reason about, more efficient, and fixes some corner cases involving implicit super-register defs. Fixed rdar://12797931. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169425 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-05 21:37:42 +00:00
Justin Holewinski	2f1086137d	[NVPTX] Fix crash with unnamed struct arguments Patch by Eric Holk git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169418 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-05 20:50:28 +00:00
Jyotsna Verma	61b632d9f7	Use multiclass to define store instructions with base+immediate offset addressing mode and immediate stored value. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169408 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-05 19:32:03 +00:00
Elena Demikhovsky	226e0e6264	Simplified BLEND pattern matching for shuffles. Generate VPBLENDD for AVX2 and VPBLENDW for v16i16 type on AVX2. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169366 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-05 09:24:57 +00:00
Evan Cheng	4e54480531	Add x86 isel lowering logic to form bit test with inverted condition. e.g. x ^ -1. Patch by David Majnemer. rdar://12755626 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169339 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-05 00:10:38 +00:00
Evan Cheng	c8e7045c8a	ARM custom lower ctpop for vector types. Patch by Pete Couperus. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169325 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-04 22:41:50 +00:00
Bill Wendling	9493dae613	Use the 'count' attribute to calculate the upper bound of an array. The count attribute is more accurate with regards to the size of an array. It also obviates the upper bound attribute in the subrange. We can also better handle an unbound array by setting the count to -1 instead of the lower bound to 1 and upper bound to 0. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169312 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-04 21:34:03 +00:00
Bill Schmidt	d7802bf0dd	This patch introduces initial-exec model support for thread-local storage on 64-bit PowerPC ELF. The patch includes code to handle external assembly and MC output with the integrated assembler. It intentionally does not support the "old" JIT. For the initial-exec TLS model, the ABI requires the following to calculate the address of external thread-local variable x: Code sequence Relocation Symbol ld 9,x@got@tprel(2) R_PPC64_GOT_TPREL16_DS x add 9,9,x@tls R_PPC64_TLS x The register 9 is arbitrary here. The linker will replace x@got@tprel with the offset relative to the thread pointer to the generated GOT entry for symbol x. It will replace x@tls with the thread-pointer register (13). The two test cases verify correct assembly output and relocation output as just described. PowerPC-specific selection node variants are added for the two instructions above: LD_GOT_TPREL and ADD_TLS. These are inserted when an initial-exec global variable is encountered by PPCTargetLowering::LowerGlobalTLSAddress(), and later lowered to machine instructions LDgotTPREL and ADD8TLS. LDgotTPREL is a pseudo that uses the same LDrs support added for medium code model's LDtocL, with a different relocation type. The rest of the processing is straightforward. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169281 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-04 16:18:08 +00:00
Bill Wendling	a7645a3c66	Add a 'count' field to the DWARF subrange. The count field is necessary because there isn't a difference between the 'lo' and 'hi' attributes for a one-element array and a zero-element array. When the count is '0', we know that this is a zero-element array. When it's >=1, then it's a normal constant sized array. When it's -1, then the array is unbounded. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169218 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-04 06:20:49 +00:00
Manman Ren	69261a6442	Stack Alignment: when creating stack objects in MachineFrameInfo, make sure the alignment is clamped to TargetFrameLowering.getStackAlignment if the target does not support stack realignment or the option "realign-stack" is off. This will cause miscompile if the address is treated as aligned and add is replaced with or in DAGCombine. Added a bool StackRealignable to TargetFrameLowering to check whether stack realignment is implemented for the target. Also added a bool RealignOption to MachineFrameInfo to check whether the option "realign-stack" is on. rdar://12713765 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169197 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-04 00:52:33 +00:00
Nadav Rotem	a569a80e58	Allow merging multiple store sequences on the same chain. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169111 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-02 17:14:09 +00:00
Eli Bendersky	e469364244	Fix an invalid regex in the test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169108 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-02 15:46:02 +00:00
Andrew Trick	657b75b994	misched: Fix RegisterPressureTracker handling of DebugVals. Assertion failed: (TopRPTracker.getPos() == RegionBegin && "bad initial Top tracker"). rdar://12790302. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169072 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-01 01:22:49 +00:00
Andrew Trick	177d87ac8d	misched: Fix the DAG builder to handle an undef operand at ExitSU. Assertion failed: (VNI && "No value to read by operand") rdar://12790267. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169071 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-01 01:22:44 +00:00
Andrew Trick	30fe61aa35	misched: Fix LiveInterval update to better handle DebugVal. Assertion failed: (itr != mi2iMap.end() && "Instruction not found in maps.") rdar://12777252. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169070 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-01 01:22:41 +00:00
Andrew Trick	67bdd42d1e	misched: fix RegionBegin when DebugValues get shuffled to the top. assert (RemainingInstrs == 0 && "Instruction count mismatch!") rdar://12776937. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169069 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-01 01:22:38 +00:00
Jakob Stoklund Olesen	8c3dccde92	Simplify REG_SEQUENCE lowering. The TwoAddressInstructionPass takes the machine code out of SSA form by expanding REG_SEQUENCE instructions into copies. It is no longer necessary to rewrite the registers used by a REG_SEQUENCE instruction because the new coalescer algorithm can do it now. REG_SEQUENCE is just converted to a sequence of sub-register copies now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169067 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-01 01:06:44 +00:00
Chad Rosier	89d86118c5	test/CodeGen/PowerPC/vec_mul.ll: Add a triple. Thanks, Hal. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169026 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-30 19:15:10 +00:00
Sebastian Pop	cb4953089b	Codegen failure for vmull with small vectors Codegen was failing with an assertion because of unexpected vector operands when legalizing the selection DAG for a MUL instruction. The asserting code was legalizing multiplies for vectors of size 128 bits. It uses a custom lowering to try and detect cases where it can use a VMULL instruction instead of a VMOVL + VMUL. The code was looking for input operands to the MUL that had been sign or zero extended. If it found the extended operands it would drop the sign/zero extension and use the original vector size as input to a VMULL instruction. The code assumed that the original input vector was 64 bits so that after dropping the extension it would fit directly into a D register and could be used as an operand of a VMULL instruction. The input code that trigger the failure used a vector of <4 x i8> that was sign extended to <4 x i32>. It was not safe to drop the sign extension in this case because the original vector is only 32 bits wide. The fix is to insert a sign extension for the vector to reach the required 64 bit size. In this particular example, the vector would need to be sign extented to a <4 x i16>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169024 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-30 19:08:04 +00:00
Chad Rosier	75cbb00727	test/CodeGen/PowerPC/vec_mul.ll: Fix register operands. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169020 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-30 18:29:01 +00:00
NAKAMURA Takumi	09af5b8c3e	test/CodeGen/PowerPC: Add explicit -march=ppc32. FIXME: Please add another RUN line if you would like to check also on ppc64. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168999 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-30 13:28:31 +00:00
Adhemerval Zanella	375cbe4143	This patch fixes the Altivec addend construction for the fused multiply-add instruction (vmaddfp) to conform with IEEE to ensure the sign of a zero result when resulting product is -0.0. The -0.0 vector addend to vmaddfp is generated by a creating a vector with full bits sets and then shifting each elements by 31-bits to the left, resulting in a vector of 0x80000000 (or -0.0 as float). The 'buildvec_canonicalize.ll' was adjusted to reflect this change and the 'vec_mul.ll' was complemented with the float vector multiplication test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168998 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-30 13:05:44 +00:00
Bill Wendling	7360116048	Handle the situation where CodeGenPrepare removes a reference to a BB that has the last invoke instruction in the function. This also removes the last landing pad in an function. This is fine, but with SjLj EH code, we've already placed a bunch of code in the 'entry' block, which expects the landing pad to stick around. When we get to the situation where CGP has removed the last landing pad, go ahead and nuke the SjLj instructions from the 'entry' block. <rdar://problem/12721258> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168930 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-29 19:38:06 +00:00
Silviu Baranga	35b3df6e31	Added atomic 64 min/max/umin/umax instrinsics support in the ARM backend. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168886 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-29 14:41:25 +00:00
Justin Holewinski	7f128ea00c	Teach the legalizer how to handle operands for VSELECT nodes If we need to split the operand of a VSELECT, it must be the mask operand. We split the entire VSELECT operand with EXTRACT_SUBVECTOR. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168883 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-29 14:26:28 +00:00
Justin Holewinski	3d200255d5	Allow targets to prefer TypeSplitVector over TypePromoteInteger when computing the legalization method for vectors For some targets, it is desirable to prefer scalarizing <N x i1> instead of promoting to a larger legal type, such as <N x i32>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168882 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-29 14:26:24 +00:00
Jakob Stoklund Olesen	89bea17af2	Avoid rewriting instructions twice. This could cause miscompilations in targets where sub-register composition is not always idempotent (ARM). <rdar://problem/12758887> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168837 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-29 00:26:11 +00:00
Nadav Rotem	90e11dc8ad	When combining consecutive stores allow loads in between the stores, if the loads do not alias. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168832 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-29 00:00:08 +00:00
Benjamin Kramer	350c00843b	ARM: Implement CanLowerReturn so large vectors get expanded into sret. Fixes 14337. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168809 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-28 20:55:10 +00:00
Andrew Trick	8b1496c922	misched: Analysis that partitions the DAG into subtrees. This is a simple, cheap infrastructure for analyzing the shape of a DAG. It recognizes uniform DAGs that take the shape of bottom-up subtrees, such as the included matrix multiplication example. This is useful for heuristics that balance register pressure with ILP. Two canonical expressions of the heuristic are implemented in scheduling modes: -misched-ilpmin and -misched-ilpmax. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168773 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-28 05:13:28 +00:00
Andrew Trick	8f82a08673	misched: better alias analysis. This fixes a hole in the "cheap" alias analysis logic implemented within the DAG builder itself, regardless of whether proper alias analysis is enabled. It now handles this pattern produced by LSR+CodeGenPrepare. %sunkaddr1 = ptrtoint * %obj to i64 %sunkaddr2 = add i64 %sunkaddr1, %lsr.iv %sunkaddr3 = inttoptr i64 %sunkaddr2 to i32* store i32 %v, i32* %sunkaddr3 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168768 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-28 03:42:49 +00:00
Bill Schmidt	daa65f5e08	This patch makes medium code model the default for 64-bit PowerPC ELF. When the CodeGenInfo is to be created for the PPC64 target machine, a default code-model selection is converted to CodeModel::Medium provided we are not targeting the Darwin OS. Defaults for Darwin are unaffected. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168747 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-27 23:36:26 +00:00
Chad Rosier	92a6e532b8	Add -verify-machineinstrs to these fast-isel test cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168723 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-27 20:49:56 +00:00
Manman Ren	39834da697	CSE: allow PerformTrivialCoalescing to check copies across basic block boundaries. Given the following case: BB0 %vreg1<def> = SUBrr %vreg0, %vreg7 %vreg2<def> = COPY %vreg7 BB1 %vreg10<def> = SUBrr %vreg0, %vreg2 We should be able to CSE between SUBrr in BB0 and SUBrr in BB1. rdar://12462006 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168717 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-27 18:58:41 +00:00
Manman Ren	f365d3984e	X86: do not fold load instructions such as [V]MOVS[S\|D] to other instructions when the destination register is wider than the memory load. These load instructions load from m32 or m64 and set the upper bits to zero, while the folded instructions may accept m128. rdar://12721174 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168710 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-27 18:09:26 +00:00
Bill Schmidt	34a9d4b3b9	This patch implements medium code model support for 64-bit PowerPC. The default for 64-bit PowerPC is small code model, in which TOC entries must be addressable using a 16-bit offset from the TOC pointer. Additionally, only TOC entries are addressed via the TOC pointer. With medium code model, TOC entries and data sections can all be addressed via the TOC pointer using a 32-bit offset. Cooperation with the linker allows 16-bit offsets to be used when these are sufficient, reducing the number of extra instructions that need to be executed. Medium code model also does not generate explicit TOC entries in ".section toc" for variables that are wholly internal to the compilation unit. Consider a load of an external 4-byte integer. With small code model, the compiler generates: ld 3, .LC1@toc(2) lwz 4, 0(3) .section .toc,"aw",@progbits .LC1: .tc ei[TC],ei With medium model, it instead generates: addis 3, 2, .LC1@toc@ha ld 3, .LC1@toc@l(3) lwz 4, 0(3) .section .toc,"aw",@progbits .LC1: .tc ei[TC],ei Here .LC1@toc@ha is a relocation requesting the upper 16 bits of the 32-bit offset of ei's TOC entry from the TOC base pointer. Similarly, .LC1@toc@l is a relocation requesting the lower 16 bits. Note that if the linker determines that ei's TOC entry is within a 16-bit offset of the TOC base pointer, it will replace the "addis" with a "nop", and replace the "ld" with the identical "ld" instruction from the small code model example. Consider next a load of a function-scope static integer. For small code model, the compiler generates: ld 3, .LC1@toc(2) lwz 4, 0(3) .section .toc,"aw",@progbits .LC1: .tc test_fn_static.si[TC],test_fn_static.si .type test_fn_static.si,@object .local test_fn_static.si .comm test_fn_static.si,4,4 For medium code model, the compiler generates: addis 3, 2, test_fn_static.si@toc@ha addi 3, 3, test_fn_static.si@toc@l lwz 4, 0(3) .type test_fn_static.si,@object .local test_fn_static.si .comm test_fn_static.si,4,4 Again, the linker may replace the "addis" with a "nop", calculating only a 16-bit offset when this is sufficient. Note that it would be more efficient for the compiler to generate: addis 3, 2, test_fn_static.si@toc@ha lwz 4, test_fn_static.si@toc@l(3) The current patch does not perform this optimization yet. This will be addressed as a peephole optimization in a later patch. For the moment, the default code model for 64-bit PowerPC will remain the small code model. We plan to eventually change the default to medium code model, which matches current upstream GCC behavior. Note that the different code models are ABI-compatible, so code compiled with different models will be linked and execute correctly. I've tested the regression suite and the application/benchmark test suite in two ways: Once with the patch as submitted here, and once with additional logic to force medium code model as the default. The tests all compile cleanly, with one exception. The mandel-2 application test fails due to an unrelated ABI compatibility with passing complex numbers. It just so happens that small code model was incredibly lucky, in that temporary values in floating-point registers held the expected values needed by the external library routine that was called incorrectly. My current thought is to correct the ABI problems with _Complex before making medium code model the default, to avoid introducing this "regression." Here are a few comments on how the patch works, since the selection code can be difficult to follow: The existing logic for small code model defines three pseudo-instructions: LDtoc for most uses, LDtocJTI for jump table addresses, and LDtocCPT for constant pool addresses. These are expanded by SelectCodeCommon(). The pseudo-instruction approach doesn't work for medium code model, because we need to generate two instructions when we match the same pattern. Instead, new logic in PPCDAGToDAGISel::Select() intercepts the TOC_ENTRY node for medium code model, and generates an ADDIStocHA followed by either a LDtocL or an ADDItocL. These new node types correspond naturally to the sequences described above. The addis/ld sequence is generated for the following cases: * Jump table addresses * Function addresses * External global variables * Tentative definitions of global variables (common linkage) The addis/addi sequence is generated for the following cases: * Constant pool entries * File-scope static global variables * Function-scope static variables Expanding to the two-instruction sequences at select time exposes the instructions to subsequent optimization, particularly scheduling. The rest of the processing occurs at assembly time, in PPCAsmPrinter::EmitInstruction. Each of the instructions is converted to a "real" PowerPC instruction. When a TOC entry needs to be created, this is done here in the same manner as for the existing LDtoc, LDtocJTI, and LDtocCPT pseudo-instructions (I factored out a new routine to handle this). I had originally thought that if a TOC entry was needed for LDtocL or ADDItocL, it would already have been generated for the previous ADDIStocHA. However, at higher optimization levels, the ADDIStocHA may appear in a different block, which may be assembled textually following the block containing the LDtocL or ADDItocL. So it is necessary to include the possibility of creating a new TOC entry for those two instructions. Note that for LDtocL, we generate a new form of LD called LDrs. This allows specifying the @toc@l relocation for the offset field of the LD instruction (i.e., the offset is replaced by a SymbolLo relocation). When the peephole optimization described above is added, we will need to do similar things for all immediate-form load and store operations. The seven "mcm-n.ll" test cases are kept separate because otherwise the intermingling of various TOC entries and so forth makes the tests fragile and hard to understand. The above assumes use of an external assembler. For use of the integrated assembler, new relocations are added and used by PPCELFObjectWriter. Testing is done with "mcm-obj.ll", which tests for proper generation of the various relocations for the same sequences tested with the external assembler. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168708 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-27 17:35:46 +00:00
Ulrich Weigand	dba37a3c43	Never use .lcomm on platforms where it does not accept an alignment argument. Instead, use a pair of .local and .comm directives. This avoids spurious differences between binaries built by the integrated assembler vs. those built by the external assembler, since the external assembler may impose alignment requirements on .lcomm symbols where the integrated assembler does not. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168704 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-27 16:11:16 +00:00
Craig Topper	020669d53f	Revert accidental commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168687 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-27 08:17:04 +00:00
Craig Topper	af87dae12c	Make PrintReg constructor explicit to prevent weird implicit conversions from accidentally being triggered. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168686 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-27 08:14:24 +00:00
Craig Topper	2cf4fb4884	Add test cases for r168417. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168681 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-27 07:19:54 +00:00
Chad Rosier	277068fe40	Extend test case for r168657. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168658 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-27 01:10:48 +00:00
NAKAMURA Takumi	cb84142195	llvm/test/CodeGen/X86/2012-07-15-broadcastfold.ll: Loosen expression corresponding to r168627. Win32 and *bsd were affected. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168651 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-27 00:48:27 +00:00
Chad Rosier	1243922fc1	Remove the X86 Maximal Stack Alignment Check pass as it is no longer necessary. This pass was conservative in that it always reserved the FP to enable dynamic stack realignment, which allowed the RA to use aligned spills for vector registers. This happens even when spills were not necessary. The RA has since been improved to use unaligned spills when necessary. The new behavior is to realign the stack if the frame pointer was already reserved for some other reason, but don't reserve the frame pointer just because a function contains vector virtual registers. Part of rdar://12719844 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168627 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-26 22:55:05 +00:00
Jakub Staszak	d642baf4be	Normalize splat 256bit vectors with 8 elements. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168600 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-26 19:24:31 +00:00
Eli Bendersky	a5cf16fc51	Rewrite test to not use a FileCheck variable and redefine it on the same line. In preparation for the FileCheck functionality change which will allow using a variable later on the same line. No functionality change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168588 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-26 14:09:46 +00:00
Benjamin Kramer	915558e775	PPC: MCize most of the darwin PIC emission. The last remaining bit is "bcl 20, 31, AnonSymbol", which I couldn't find the instruction definition for. Only whitespace changes in assembly output. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168541 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-24 13:18:25 +00:00
Akira Hatanaka	f09a03776d	[mips] Generate big GOT code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168460 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-21 20:40:38 +00:00
Anton Korobeynikov	0ae6124034	Add support for varargs functions for msp430. Patch by Job Noorman! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168440 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-21 17:28:27 +00:00
Anton Korobeynikov	6cbeb4d839	Add support for byval args. Patch by Job Noorman! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168439 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-21 17:23:03 +00:00
Tim Northover	310f248c22	Fix physical register liveness calculations: + Take account of clobbers + Give outputs priority over inputs since they happen later. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168360 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-20 09:56:11 +00:00
Elena Demikhovsky	4fe5405bdd	Intel OCL built-ins calling conventions now support MacOS 32-bit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168359 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-20 09:37:57 +00:00
Anton Korobeynikov	2386fc8daa	Factor out type info emission into separate routine. It turned out that ARM wants different layout of type infos. This is yet another patch in attempt to fix PR7187 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168325 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-19 21:06:26 +00:00
Jakob Stoklund Olesen	e42561ad0c	Handle mixed normal and early-clobber defs on inline asm. PR14376. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168320 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-19 19:31:10 +00:00
Andrew Trick	410fe6fe19	Use a full triple for a PPC test case for asm syntax. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168283 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-18 06:21:03 +00:00
Andrew Trick	953663a9aa	Silence the buildbots for this test while I figure out the triple git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168249 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-17 03:39:26 +00:00
Andrew Trick	e1f663933a	Broaden isSchedulingBoundary to check aliases of SP. On PPC the stack pointer is X1, but ADJCALLSTACK writes R1. Fixes PR14315: Register regmask dependency problem with misched. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168248 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-17 03:35:11 +00:00
Eli Friedman	43147afd71	Mark FP_EXTEND form v2f32 to v2f64 as "expand" for ARM NEON. Patch by Pete Couperus. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168240 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-17 01:52:46 +00:00
Chad Rosier	0a63b6ac79	[fast-isel] Add the -verify-machineinstrs to these test cases. The remaining test cases require fixes to fast-isel before the verifier can be enabled. Part of rdar://12594152 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168233 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-17 00:42:06 +00:00
Akira Hatanaka	94e472832f	Initial implementation of MipsTargetLowering::isLegalAddressingMode. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168230 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-17 00:25:41 +00:00
Weiming Zhao	e56764bad1	Remove hard coded registers in ARM ldrexd and strexd instructions This patch replaces the hard coded GPR pair [R0, R1] of Intrinsic:arm_ldrexd and [R2, R3] of Intrinsic:arm_strexd with even/odd GPRPair reg class. Similar to the lowering of atomic_64 operation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168207 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-16 21:55:34 +00:00
Anton Korobeynikov	b1a392e7c5	Make sure FABS on v2f32 and v4f32 is legal on ARM NEON This fixes PR14359 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168200 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-16 21:15:20 +00:00
Richard Osborne	ccc015d431	Fix handling of aliases to functions. An alias to a function should use pc relative addressing. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168199 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-16 21:12:38 +00:00
Justin Holewinski	2085d00d09	[NVPTX] Order global variables in def-use order before emiting them in the final assembly git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168198 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-16 21:03:51 +00:00
NAKAMURA Takumi	e0827d8880	llvm/test/CodeGen/X86/hipe-cc*.ll: Add explicit -mcpu, or they don't expect to pass on Atom. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168171 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-16 16:07:37 +00:00
Duncan Sands	dc7f174b5e	Add the Erlang/HiPE calling convention, patch by Yiannis Tsiouris. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168166 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-16 12:36:39 +00:00
Craig Topper	d577552c66	Use roundps/pd for llvm.ceil, llvm.trunc, llvm.rint, and llvm.nearbyint of vector types. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168141 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-16 06:37:56 +00:00
Akira Hatanaka	a032dbd62f	[mips] Fix delay slot filler so that instructions with register operand $1 are allowed in branch delay slot. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168131 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-16 02:39:34 +00:00
Eli Friedman	846ce8ea67	Mark FP_ROUND for converting NEON v2f64 to v2f32 as expand. Add a missing case to vector legalization so this actually works. Patch by Pete Couperus. Fixes PR12540. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168107 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-15 22:44:27 +00:00
Adhemerval Zanella	e95ed2b7af	PowerPC: Lowering floor intrinsic for Altivec This patch lowers the llvm.floor, llvm.ceil, llvm.trunc, and llvm.nearbyint to Altivec instruction when using 4 single-precision float vectors. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@168086 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-15 20:56:03 +00:00
Bill Schmidt	527388dea5	This patch is in preparation for adding medium code model support to the PPC64 target. The five tests modified herein test code generation that is sensitive to the code model selected. So I've added -code-model=small to the RUN commands for each. Since small code model is the default, this has no effect for now; but this prepares us for eventually changing the default to medium code model for PPC64. Test changes verified with small and medium code model as default on powerpc64-unknown-linux-gnu. All tests continue to pass. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167999 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-14 23:23:27 +00:00
Jakub Staszak	0e52f46e48	Make sure to not get AVX code on an AVX-capable host. Revealed in r167967. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167989 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-14 22:24:01 +00:00
NAKAMURA Takumi	d0a4a4bf5c	test/CodeGen/Hexagon/postinc-load.ll: Suppress it for now. It triggered the failure on i686 hosts. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167988 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-14 22:22:37 +00:00
Eric Christopher	06b423452c	Remove the CellSPU port. Approved by Chris Lattner. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167984 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-14 22:09:20 +00:00
NAKAMURA Takumi	9292136787	llvm/test/CodeGen/X86/memset.ll: FileCheck-ize, and add another case on +avx. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167975 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-14 21:01:40 +00:00
Jyotsna Verma	cb02fa9d7f	Added multiclass for post-increment load instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167974 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-14 20:38:48 +00:00
Benjamin Kramer	7c6e8cd7cc	Force CPU in test so we don't accidentally get AVX code on an AVX-capable host. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167973 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-14 20:31:42 +00:00
Benjamin Kramer	2dbe929685	X86: Enable SSE memory intrinsics even when stack alignment is less than 16 bytes. The stack realignment code was fixed to work when there is stack realignment and a dynamic alloca is present so this shouldn't cause correctness issues anymore. Note that this also enables generation of AVX instructions for memset under the assumptions: - Unaligned loads/stores are always fast on CPUs supporting AVX - AVX is not slower than SSE We may need some tweaked heuristics if one of those assumptions turns out not to be true. Effectively reverts r58317. Part of PR2962. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167967 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-14 20:08:40 +00:00
Nadav Rotem	50b66387e3	The code pattern "imm0_255_neg" is used for checking if an immediate value is a small negative number. This patch changes the definition of negative from -0..-255 to -1..-255. I am changing this because of a bug that we had in some of the patterns that assumed that "subs" of zero does not set the carry flag. rdar://12028498 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167963 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-14 19:39:15 +00:00
Justin Holewinski	a20067b5d4	[NVPTX] Implement custom lowering of loads/stores for i1 Loads from i1 become loads from i8 followed by trunc Stores to i1 become zext to i8 followed by store to i8 Fixes PR13291 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167948 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-14 19:19:16 +00:00
Anton Korobeynikov	062a6c8380	Fix really stupid ARM EHABI info generation bug: we should not emit eh table and handler data if there are no landing pads in the function. Patch by Logan Chien with some cleanups from me. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167945 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-14 19:13:30 +00:00
Rafael Espindola	8e2b8ae3b1	Handle DAG CSE adding new uses during ReplaceAllUsesWith. Fixes PR14333. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167912 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-14 05:08:56 +00:00
Anton Korobeynikov	25efd6d556	Use TARGET2 relocation for TType references on ARM. Do some cleanup of the code while here. Inspired by patch by Logan Chien! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167904 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-14 01:47:00 +00:00
Eric Christopher	242343d1ab	Revert "Use the 'count' attribute instead of the 'upper_bound' attribute." temporarily as it is breaking the gdb bots. This reverts commit r167806/e7ff4c14b157746b3e0228d2dce9f70712d1c126. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167886 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-13 23:30:43 +00:00
Manman Ren	2adc503f29	X86: when constructing VZEXT_LOAD from other loads, makes sure its output chain is correctly setup. As an example, if the original load must happen before later stores, we need to make sure the constructed VZEXT_LOAD is constrained to be before the stores. rdar://12684358 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167859 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-13 19:13:05 +00:00
Ulrich Weigand	b64e2115de	Do not consider a machine instruction that uses and defines the same physical register as candidate for common subexpression elimination in MachineCSE. This fixes a bug on PowerPC in MultiSource/Applications/oggenc/oggenc caused by MachineCSE invalidly merging two separate DYNALLOC insns. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167855 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-13 18:40:58 +00:00
Duncan Sands	b2df01ab2a	Codegen support for arbitrary vector getelementptrs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167830 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-13 13:01:58 +00:00
Bill Wendling	e7ff4c14b1	Use the 'count' attribute instead of the 'upper_bound' attribute. If we have a type 'int a[1]' and a type 'int b[0]', the generated DWARF is the same for both of them because we use the 'upper_bound' attribute. Instead use the 'count' attrbute, which gives the correct number of elements in the array. <rdar://problem/12566646> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167806 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-13 02:31:47 +00:00
Andrew Trick	f546ac5f9b	Cleanup the main RegisterCoalescer loop. Block priorities still apply outside loops. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167793 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-13 00:34:44 +00:00
Michael Liao	01c6de341c	Fix test case added in patch fixing PR14314 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167769 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-12 22:33:18 +00:00
Andrew Trick	ae692f2bae	misched: Infrastructure for weak DAG edges. This adds support for weak DAG edges to the general scheduling infrastructure in preparation for MachineScheduler support for heuristics based on weak edges. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167738 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-12 19:28:57 +00:00
Michael Liao	dd3383fd09	Fix PR14314 - Fix operand order for atomic sub, where the minuend is the value loaded from memory and the subtrahend is the parameter specified. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167718 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-12 06:49:17 +00:00
Justin Holewinski	08e9cb46fe	[NVPTX] Add more precise PTX/SM target attributes Each SM and PTX version is modeled as a subtarget feature/CPU. Additionally, PTX 3.1 is added as the default PTX version to be out-of-the-box compatible with CUDA 5.0. Available CPUs for this target: sm_10 - Select the sm_10 processor. sm_11 - Select the sm_11 processor. sm_12 - Select the sm_12 processor. sm_13 - Select the sm_13 processor. sm_20 - Select the sm_20 processor. sm_21 - Select the sm_21 processor. sm_30 - Select the sm_30 processor. sm_35 - Select the sm_35 processor. Available features for this target: ptx30 - Use PTX version 3.0. ptx31 - Use PTX version 3.1. sm_10 - Target SM 1.0. sm_11 - Target SM 1.1. sm_12 - Target SM 1.2. sm_13 - Target SM 1.3. sm_20 - Target SM 2.0. sm_21 - Target SM 2.1. sm_30 - Target SM 3.0. sm_35 - Target SM 3.5. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167699 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-12 03:16:43 +00:00
Evan Cheng	785500618a	Convert an improper CodeGen test to a MC test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167663 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-10 04:30:40 +00:00
Evan Cheng	2f69102b8d	xfail a bad test. This is a MC test but it's dependent on a codegen optimization which is now disabled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167658 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-10 02:34:36 +00:00
Evan Cheng	b341fac05a	Disable the Thumb no-return call optimization: mov lr, pc b.w _foo The "mov" instruction doesn't set bit zero to one, it's putting incorrect value in lr. It messes up backtraces. rdar://12663632 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167657 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-10 02:09:05 +00:00
Craig Topper	9c7ae01f39	Cleanup pcmp(e/i)str(m/i) instruction definitions and load folding support. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167652 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-10 01:23:36 +00:00
Justin Holewinski	89443ff7ae	[NVPTX] Use ABI alignment for parameters when alignment is not specified. Affects SM 2.0+. Fixes bug 13324. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167646 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-09 23:50:24 +00:00
Jakob Stoklund Olesen	722c9a7925	Fix assertions in updateRegMaskSlots(). The RegMaskSlots contains 'r' slots while NewIdx and OldIdx are 'B' slots. This broke the checks in the assertions. This fixes PR14302. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167625 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-09 19:18:49 +00:00
Amara Emerson	214fd3d244	Recommit modified r167540. Improve ARM build attribute emission for architectures types. This also changes the default architecture emitted for a generic CPU to "v7". git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167574 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-08 09:51:45 +00:00
Michael Liao	be02a90de1	Add support of RTM from TSX extension - Add RTM code generation support throught 3 X86 intrinsics: xbegin()/xend() to start/end a transaction region, and xabort() to abort a tranaction region git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167573 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-08 07:28:54 +00:00
Akira Hatanaka	e90a3bcae1	[mips] Custom-lower ISD::FRAME_TO_ARGS_OFFSET node. Patch by Sasa Stankovic. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167548 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-07 19:10:58 +00:00
Andrew Trick	3b87f6204f	misched: Heuristics based on the machine model. misched is disabled by default. With -enable-misched, these heuristics balance the schedule to simultaneously avoid saturating processor resources, expose ILP, and minimize register pressure. I've been analyzing the performance of these heuristics on everything in the llvm test suite in addition to a few other benchmarks. I would like each heuristic check to be verified by a unit test, but I'm still trying to figure out the best way to do that. The heuristics are still in considerable flux, but as they are refined we should be rigorous about unit testing the improvements. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167527 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-07 07:05:09 +00:00
Ulrich Weigand	86aef0a4f0	On PowerPC64, integer return values (as well as arguments) are supposed to be extended to a full register. This is modeled in the IR by marking the return value (or argument) with a signext or zeroext attribute. However, while these attributes are respected for function arguments, they are currently ignored for function return values by the PowerPC back-end. This patch updates PPCCallingConv.td to ask for the promotion to i64, and fixes LowerReturn and LowerCallResult to implement it. The new test case verifies that both arguments and return values are properly extended when passing them; and also that the optimizers understand incoming argument and return values are in fact guaranteed by the ABI to be extended. The patch caused a spurious breakage in CodeGen/PowerPC/coalesce-ext.ll, since the test case used a "ret" instruction to create a use of an i32 value at the end of the function (to set up data flow as required for what the test is intended to test). Since there's now an implicit promotion to i64, that data flow no longer works as expected. To fix this, this patch now adds an extra "add" to ensure we have an appropriate use of the i32 value. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167396 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-05 19:39:45 +00:00
Hal Finkel	827b7a070d	Add support for the PowerPC-specific inline asm Z constraint and y modifier. The Z constraint specifies an r+r memory address, and the y modifier expands to the "r, r" in the asm string. For this initial implementation, the base register is forced to r0 (which has the special meaning of 0 for r+r addressing on PowerPC) and the full address is taken in the second register. In the future, this should be improved. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167388 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-05 18:18:42 +00:00
Adhemerval Zanella	cfe09ed28d	[PATCH] PowerPC: Expand load extend vector operations This patch expands the SEXTLOAD, ZEXTLOAD, and EXTLOAD operations for vector types when altivec is enabled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167386 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-05 17:15:56 +00:00
Akira Hatanaka	3c77033a90	[mips] Set flag neverHasSideEffects flag on floating point conversion instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167348 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-03 00:53:12 +00:00
Akira Hatanaka	3c9c1ab7b7	[mips] Set flag isAsCheapAsAMove flag on instruction LUi. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167345 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-03 00:26:02 +00:00
Akira Hatanaka	11a45c214c	[mips] Stop reserving register AT and use register scavenger when a scratch register is needed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167341 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-03 00:05:43 +00:00
Akira Hatanaka	fb60a4e823	[mips] Fix bug in test case. Disable machine LICM to prevent instruction from being moved out of a basic block. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167322 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-02 21:46:42 +00:00
Quentin Colombet	43934aee71	Vext Lowering was missing opportunities git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167318 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-02 21:32:17 +00:00
Akira Hatanaka	5c87b732f2	[mips] Use register number instead of name to print register $AT. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167315 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-02 21:26:03 +00:00
Akira Hatanaka	173192fa71	[mips] Delete MipsFunctionInfo::EmitNOAT. Unconditionally print directive "set .noat" so that the assembler doesn't issue warnings when register $AT is used. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167310 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-02 20:56:25 +00:00
NAKAMURA Takumi	b75111f1ef	test/CodeGen/X86/fp-fast.ll: Add +avx. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167207 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-01 02:13:45 +00:00
Owen Anderson	607ebde651	Add a few more simple fast-math constant propagations and cancellations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167200 91177308-0d34-0410-b5e6-96231b3b80d8	2012-11-01 02:00:53 +00:00
Shuxin Yang	a5526a9bff	(For X86) Enhancement to add-carray/sub-borrow (adc/sbb) optimization. The adc/sbb optimization is to able to convert following expression into a single adc/sbb instruction: (ult) ... = x + 1 // where the ult is unsigned-less-than comparison (ult) ... = x - 1 This change is to flip the "x >u y" (i.e. ugt comparison) in order to expose the adc/sbb opportunity. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167180 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-31 23:11:48 +00:00
Akira Hatanaka	497204a94b	[mips] Set isAsCheapAsAMove flag on ADDiu and DADDiu, which enables re-materialization of immediate loads. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167153 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-31 18:37:55 +00:00
Akira Hatanaka	39fa606992	Test case for r167039. Check that tail-call optimization is disabled for mips16. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167139 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-31 17:25:23 +00:00
Reed Kotler	9441125d63	Implement ADJCALLSTACKUP and ADJCALLSTACKDOWN git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167107 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-31 05:21:10 +00:00
Bill Schmidt	42d43351b2	This patch addresses an ABI compatibility issue with empty aggregate parameters. Examples of these are: struct { } a; union { } b[256]; int a[0]; An empty aggregate has an address, although dereferencing that address is pointless. When passed as a parameter, an empty aggregate does not consume a protocol register, nor does it consume a doubleword in the parameter save area. Passing an empty aggregate by reference passes an address just as for any other aggregate. Returning an empty aggregate uses GPR3 as a hidden address of the return value location, just as for any other aggregate. The patch modifies PPCTargetLowering::LowerFormalArguments_64SVR4 and PPCTargetLowering::LowerCall_64SVR4 to properly skip empty aggregate parameters passed by value. The handling of return values and by-reference parameters was already correct. Built on powerpc64-unknown-linux-gnu and tested with no new regressions. A test case is included to test proper handling of empty aggregate parameters on both sides of the function call protocol. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167090 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-31 01:15:05 +00:00
Manman Ren	dfd0b9b460	X86 SSE: update rsqrtss and rcpss to use two source operands and the first source operand is tied to the destination operand. This is to accurately model the corresponding instructions where the upper bits are unmodified. rdar://12558838 PR14221 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167064 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-30 23:53:59 +00:00
Manman Ren	4c74a956b2	X86 MMX: optimize transfer from mmx to i32 We used to generate a store (movq) + a load. Now we use movd. rdar://9946746 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167056 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-30 22:15:38 +00:00
Akira Hatanaka	2f34d754d0	[mips] Allow tail-call optimization for vararg functions and functions which use the caller's stack. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167048 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-30 20:16:31 +00:00
Adhemerval Zanella	c83b5dc625	PowerPC: Expand FSRQT for vector types This patch expands FSQRT for floating point vector types when altivec is used. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167034 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-30 18:29:42 +00:00
Quentin Colombet	9a419f656e	Change ForceSizeOpt attribute into MinSize attribute git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167020 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-30 16:32:52 +00:00
Adhemerval Zanella	5f41fd685b	PowerPC: More support for Altivec compare operations This patch adds more support for vector type comparisons using altivec. It adds correct support for v16i8, v8i16, v4i32, and v4f32 vector types for comparison operators ==, !=, >, >=, <, and <=. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167015 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-30 13:50:19 +00:00
Hans Wennborg	04d7d13d30	Use TargetTransformInfo to control switch-to-lookup table transformation When the switch-to-lookup tables transform landed in SimplifyCFG, it was pointed out that this could be inappropriate for some targets. Since there was no way at the time for the pass to know anything about the target, an awkward reverse-transform was added in CodeGenPrepare that turned lookup tables back into switches for some targets. This patch uses the new TargetTransformInfo to determine if a switch should be transformed, and removes CodeGenPrepare::ConvertLoadToSwitch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167011 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-30 11:23:25 +00:00
Reed Kotler	c09856b535	Change mips16 delay slot jumps to non delay slot forms by default. We will make them delay slot forms if there is something that can be placed in the delay slot during a separate pass. Mips16 extended instructions cannot be placed in delay slots. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166990 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-30 00:54:49 +00:00
Jakub Staszak	a24262a0f5	Re-commit r166971. I reverted it to quickly, when buildbots didn't have a chance to test it with chapni's fix (-mattr=+avx). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166985 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-30 00:01:57 +00:00
Jakub Staszak	c1ed096b6b	Revert r166971. It causes buildbot failure. To be investigated. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166979 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-29 23:13:50 +00:00
NAKAMURA Takumi	926dd447f1	llvm/test/CodeGen/X86/vec_shuffle-30.ll: Try to unbreak builds - assuming +avx. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166974 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-29 22:45:18 +00:00
Jakub Staszak	6d317824a5	Allow to fold vector load if there is more than one bitcast, so in the case: %0 = load <8 x i16>* %dest %1 = shufflevector <8 x i16> %0, <8 x i16> %in, <8 x i32> < i32 0, i32 1, i32 2, i32 3, i32 13, i32 undef, i32 14, i32 14> store <8 x i16> %1, <8 x i16>* %dest We get: vmovlpd (%eax), %xmm0, %xmm0 instead of: vmovaps (%eax), %xmm1 vmovsd %xmm1, %xmm0, %xmm0 No extra test-case is added. I just fixed the existing one (also it uses FileCheck now). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166971 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-29 21:56:35 +00:00
Bill Schmidt	e6c56433de	This patch solves a problem with passing varargs parameters under the PPC64 ELF ABI. A varargs parameter consisting of a single-precision floating-point value, or of a single-element aggregate containing a single-precision floating-point value, must be passed in the low-order (rightmost) four bytes of the doubleword stack slot reserved for that parameter. If there are GPR protocol registers remaining, the parameter must also be mirrored in the low-order four bytes of the reserved GPR. Prior to this patch, such parameters were being passed in the high-order four bytes of the stack slot and the mirrored GPR. The patch adds a new test case to verify the correct code generation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166968 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-29 21:18:16 +00:00
Reed Kotler	576b1dbbef	Implement patterns for extloadi8 and extloadi16 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166960 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-29 19:39:04 +00:00
Ulrich Weigand	e669c930a6	In various places throughout the code generator, there were special checks to avoid performing compile-time arithmetic on PPCDoubleDouble. Now that APFloat supports arithmetic on PPCDoubleDouble, those checks are no longer needed, and we can treat the type like any other. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166958 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-29 18:35:49 +00:00
Chad Rosier	53e216b304	Remove redundant test case from r166949, per Eli's suggestion. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166953 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-29 18:18:26 +00:00
Chad Rosier	2fbc239e4f	[ms-inline asm] Add support for the [] operator. Essentially, [expr1][expr2] is equivalent to [expr1 + expr2]. See test cases for more examples. rdar://12470392 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166949 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-29 18:01:54 +00:00
Michael Liao	2a2263e744	Fix PR14204 - Add missing pattern on X86ISD::VZEXT from VR256 to VR256 when AVX2 is enabled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166947 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-29 17:57:12 +00:00
Jakob Stoklund Olesen	573303e62a	Completely disallow partial copies in adjustCopiesBackFrom(). Partial copies can show up even when CoalescerPair.isPartial() returns false. For example: %vreg24:dsub_0<def> = COPY %vreg31:dsub_0; QPR:%vreg24,%vreg31 Such a partial-partial copy is not good enough for the transformation adjustCopiesBackFrom() needs to do. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166944 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-29 17:51:52 +00:00
Ulrich Weigand	78dab643e0	Allow i32/i64 for 'f' constraint on PowerPC. This fixes PR12757. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166943 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-29 17:49:34 +00:00
Reed Kotler	8834a20d5d	Expand all atomic ops for mips16. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166935 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-29 16:16:54 +00:00
Preston Gurd	a836563e32	This patch addresses a problem with the Post RA scheduler generating an incorrect instruction sequence due to it not being aware that an inline assembly instruction may reference memory. This patch fixes the problem by causing the scheduler to always assume that any inline assembly code instruction could access memory. This is necessary because the internal representation of the inline instruction does not include any information about memory accesses. This should fix PR13504. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166929 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-29 15:01:23 +00:00
Bill Schmidt	01d013ec04	This patch adds alignment information for long double to the 64-bit PowerPC ELF subtarget. The existing logic is used as a fallback to avoid any changes to the Darwin ABI. PPC64 ELF now has two possible data layout strings: one for FreeBSD, which requires 8-byte alignment, and a default string that requires 16-byte alignment. I've added a test for PPC64 Linux to verify the 16-byte alignment. If somebody wants to add a separate test for FreeBSD, that would be great. Note that there is a companion patch to update the alignment information in Clang, which I am committing now as well. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166928 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-29 14:59:36 +00:00
Reed Kotler	3a9f4568fb	Implement brind operator for mips16. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166903 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-28 23:08:07 +00:00
Reed Kotler	f99998a2b0	This patch is for the implementation of mips16 complex pattern addr16. Previously mips16 was sharing the pattern addr which is used for mips32 and mips64. This had a number of problems: 1) Storing and loading byte and halfword quantities for mips16 has particular problems due to the primarily non mips16 nature of SP. When we must load/store byte/halfword stack objects in a function, we must create a mips16 alias register for SP. This functionality is tested in stchar.ll. 2) We need to have an FP register under certain conditions (such as dynamically sized alloca). We use mips16 register S0 for this purpose. In this case, we also use this register when accessing frame objects so this issue also affects the complex pattern addr16. This functionality is tested in alloca16.ll. The Mips16InstrInfo.td has been updated to use addr16 instead of addr. The complex pattern C++ function for addr has been copied to addr16 and updated to reflect the above issues. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166897 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-28 06:02:37 +00:00
Jakob Stoklund Olesen	163f67f4d9	Never attempt to join an early-clobber def with a regular kill. This fixes PR14194. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166880 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-27 17:41:27 +00:00
Quentin Colombet	80acd97266	[code size][ARM] Emit regular call instructions instead of the move, branch sequence git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166854 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-27 01:10:17 +00:00
Reed Kotler	7797e8f901	Implement MipsHi for mips16 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166852 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-27 00:57:14 +00:00
Akira Hatanaka	21a9a98b77	[mips] Do not tail-call optimize vararg functions or functions with byval arguments. This is rather conservative and should be fixed later to be more aggressive. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166851 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-27 00:56:56 +00:00
Akira Hatanaka	4618e0b574	[mips] Make sure FuncArg doesn't advance when OrigArgIndex is the same as in the previous iteration. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166850 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-27 00:44:39 +00:00
Jakob Stoklund Olesen	17f42e02a1	Revert r163298 "Optimize codegen for VSETLNi{8,16,32} operating on Q registers." Keep the integer_insertelement test case, the new coalescer can handle this kind of lane insertion without help from pseudo-instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166835 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-26 23:39:46 +00:00
Reed Kotler	25424154f4	implement mips16 tls global addr git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166827 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-26 22:57:32 +00:00
Jakob Stoklund Olesen	cd275f5687	Add GPRPair Register class to ARM. Some instructions in ARM require 2 even-odd paired GPRs. This patch adds support for such register class. Patch by Weiming Zhao! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166816 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-26 21:29:15 +00:00
Reed Kotler	a81be80b0e	Implement carry for subtract/add for mips16 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166755 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-26 04:46:26 +00:00
Reed Kotler	28a6214b59	implement large (>16 bit) constant loading. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166749 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-26 03:09:34 +00:00
Reed Kotler	30675b12fd	fix test setgek.ll so that it will not give false "make check" failure in some cases git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166747 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-26 01:29:42 +00:00
Reed Kotler	4c5a6dab17	implement mips16 patterns for select nodes git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166721 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-25 21:33:30 +00:00
Michael Liao	8d7cd1d8fc	Add test for ATOM ISA SSSE3 - Remove SSE4.1 feature in other ATOM-based test cases git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166699 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-25 17:50:05 +00:00
Bill Schmidt	37900c5dcb	This patch addresses a PPC64 ELF issue with passing parameters consisting of structs having size 3, 5, 6, or 7. Such a struct must be passed and received as right-justified within its register or memory slot. The problem is only present for structs that are passed in registers. Previously, as part of a patch handling all structs of size less than 8, I added logic to rotate the incoming register so that the struct was left- justified prior to storing the whole register. This was incorrect because the address of the parameter had already been adjusted earlier to point to the right-adjusted value in the storage slot. Essentially I had accidentally accounted for the right-adjustment twice. In this patch, I removed the incorrect logic and reorganized the code to make the flow clearer. The removal of the rotates changes the expected code generation, so test case structsinregs.ll has been modified to reflect this. I also added a new test case, jaggedstructs.ll, to demonstrate that structs of these sizes can now be properly received and passed. I've built and tested the code on powerpc64-unknown-linux-gnu with no new regressions. I also ran the GCC compatibility test suite and verified that earlier problems with these structs are now resolved, with no new regressions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166680 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-25 13:38:09 +00:00
Elena Demikhovsky	c0cd72204d	The test avx-intel-ocl.ll failed. I can't reproduce on any of my machines. I added -mcpu flag, may be it will fix the problem git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166669 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-25 08:38:42 +00:00
Chad Rosier	b3009eec47	[ms-inline asm] Add back-end test case for r166632. Make sure we emit the correct .s output as well as get the correct encoding by the integrated assembler. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166638 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-24 23:10:28 +00:00
Evan Cheng	d258eb3ec5	Fix a miscompilation caused by a typo. When turning a adde with negative value into a sbc with a positive number, the immediate should be complemented, not negated. Also added a missing pattern for ARM codegen. rdar://12559385 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166613 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-24 19:53:01 +00:00
Elena Demikhovsky	3575222175	Special calling conventions for Intel OpenCL built-in library. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166566 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-24 14:46:16 +00:00
Michael Liao	1a5cc710ee	Teach DAG combine to fold (buildvec (Xint2fp x)) to (Xint2fp (buildvec x)) - If more than 1 elemennts are defined and target supports the vectorized conversion, use the vectorized one instead to reduce the strength on conversion operation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166546 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-24 04:14:18 +00:00
Michael Liao	991b6a22b6	Add custom conversion from v2u32 to v2f32 in 32-bit mode - As there's no 64-bit GPRs in 32-bit mode, a custom conversion from v2u32 to v2f32 is added to improve the efficiency of the code generated. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166545 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-24 04:09:32 +00:00
Akira Hatanaka	2ef5bd3ba6	[mips] Make sure sret argument is returned in register V0. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166539 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-24 02:10:54 +00:00
Rafael Espindola	847a9c6d77	Change x86_fastcallcc to require inreg markers. This allows it to known the difference from "int x" (which should go in registers and "struct y {int x;}" (which should not). Clang will be updated in the next patches. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166536 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-24 01:58:48 +00:00
Michael Liao	0787274b70	Fix PR14161 - Check index being extracted to be constant 0 before simplfiying. Otherwise, retain the original sequence. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166504 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-23 21:40:15 +00:00
Michael Liao	d9d09600ee	Enable lowering ZERO_EXTEND/ANY_EXTEND to PMOVZX from SSE4.1 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166486 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-23 17:34:00 +00:00
Reed Kotler	293e5d0e0a	implement setXX patterns git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166459 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-23 01:35:48 +00:00
Bill Wendling	8f47fc8f00	When a block ends in an indirect branch, add its successors to the machine basic block. The CFG of the machine function needs to know that the targets of the indirect branch are successors to the indirect branch. <rdar://problem/12529625> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166448 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-22 23:30:04 +00:00
Akira Hatanaka	30580cea43	[mips] Use 64-bit registers to return an sret pointer if target ABI is N64. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166344 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-19 22:11:40 +00:00
Akira Hatanaka	2b861be96e	[mips] Add code to do tail call optimization. Currently, it is enabled only if option "enable-mips-tail-calls" is given and all of the callee's arguments are passed in registers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166342 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-19 21:47:33 +00:00
Shuxin Yang	970755e519	This patch is to fix radar://8426430. It is about llvm support of __builtin_debugtrap() which is supposed to consistently raise SIGTRAP across all systems. In contrast, __builtin_trap() behave differently on different systems. e.g. it raises SIGTRAP on ARM, and SIGILL on X86. The purpose of __builtin_debugtrap() is to consistently provide "trap" functionality, in the mean time preserve the compatibility with on gcc on __builtin_trap(). The X86 backend is already able to handle debugtrap(). This patch is to: 1) make front-end recognize "__builtin_debugtrap()" (emboddied in the one-line change to Clang). 2) In DAG legalization phase, by default, "debugtrap" will be replaced with "trap", which make the __builtin_debugtrap() "available" to all existing ports without the hassle of changing their code. 3) If trap-function is specified (via -trap-func=xyz to llc), both __builtin_debugtrap() and __builtin_trap() will be expanded into the function call of the specified trap function. This behavior may need change in the future. The provided testing-case is to make sure 2) and 3) are working for ARM port, and we already have a testing case for x86. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166300 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-19 20:11:16 +00:00
Michael Liao	facace808c	Lower BUILD_VECTOR to SHUFFLE + INSERT_VECTOR_ELT for X86 - If INSERT_VECTOR_ELT is supported (above SSE2, either by custom sequence of legal insn), transform BUILD_VECTOR into SHUFFLE + INSERT_VECTOR_ELT if most of elements could be built from SHUFFLE with few (so far 1) elements being inserted. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166288 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-19 17:15:18 +00:00
Stepan Dyatkovskiy	0d3c8d5d16	ARM: Removed extra stack frame object for fixed byval arguments, VarArgsStyleRegisters invocation was reworked due to some improper usage in past. PR14099 also demonstrates it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166273 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-19 08:23:06 +00:00
Sebastian Pop	83ba06afa8	Clear unknown mem ops when merging stack slots (pr14090) When merging stack slots, if StackColoring::remapInstructions gets a value back from GetUnderlyingObject that it does not know about or is not itself a stack slot, clear the memory operand in case it aliases the merged slot. This prevents the introduction of incorrect aliasing information. Author: Matthew Curtis <mcurtis@codeaurora.org> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166216 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-18 19:53:48 +00:00
Nadav Rotem	1c5bf3f429	In SimplifySelectOps we pulled two loads through a select node despite the fact that one was dependent on the other. rdar://12513091 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166196 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-18 18:06:48 +00:00
Ulrich Weigand	6c28a7eec8	This patch fixes failures in the SingleSource/Regression/C/uint64_to_float test case on PowerPC caused by rounding errors when converting from a 64-bit integer to a single-precision floating point. The reason for this are double-rounding effects, since on PowerPC we have to convert to an intermediate double-precision value first, which gets rounded to the final single-precision result. The patch fixes the problem by preparing the 64-bit integer so that the first conversion step to double-precision will always be exact, and the final rounding step will result in the correctly-rounded single-precision result. The generated code sequence is equivalent to what GCC would generate. When -enable-unsafe-fp-math is in effect, that extra effort is omitted and we accept possible rounding errors (just like GCC does as well). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166178 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-18 13:16:11 +00:00
Michael Liao	07edaf3801	Revert part of r166049 back and enable test case in r166125. - Folding (trunc (concat ... X )) to (concat ... (trunc X) ...) is valid when '...' are all 'undef's. - r166125 relies on this transformation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166155 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-17 23:45:54 +00:00
Michael Liao	6e0c2b36d6	Disable extract-concat test case temporarily git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166141 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-17 23:08:19 +00:00
Michael Liao	4031e9018b	Revert r166049 - In general, it's unsafe for this transformation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166135 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-17 22:41:15 +00:00
Reed Kotler	95a2bb4cdf	Add conditional branch instructions and their patterns. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166134 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-17 22:29:54 +00:00
Michael Liao	13429e224c	Teach DAG combine to fold (extract_subvec (concat v1, ..) i) to v_i - If the extracted vector has the same type of all vectored being concatenated together, it should be simplified directly into v_i, where i is the index of the element being extracted. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166125 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-17 20:48:33 +00:00
Anton Korobeynikov	e4b33a115b	Fix fallout from RegInfo => FrameLowering refactoring on MSP430. Patch by Job Noorman! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166108 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-17 17:37:11 +00:00
Michael Liao	281ae5abf5	Fix setjmp on models with non-Small code model nor non-Static relocation model - MBB address is only valid as an immediate value in Small & Static code/relocation models. On other models, LEA is needed to load IP address of the restore MBB. - A minor fix of MBB in MC lowering is added as well to enable target relocation flag being propagated into MC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166084 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-17 02:22:27 +00:00
Jakob Stoklund Olesen	320db3f805	Avoid rematerializing a redef immediately after the old def. PR14098 contains an example where we would rematerialize a MOV8ri immediately after the original instruction: %vreg7:sub_8bit<def> = MOV8ri 9; GR32_ABCD:%vreg7 %vreg22:sub_8bit<def> = MOV8ri 9; GR32_ABCD:%vreg7 Besides being pointless, it is also wrong since the original instruction only redefines part of the register, and the value read by the new instruction is wrong. The problem was the LiveRangeEdit::allUsesAvailableAt() didn't special-case OrigIdx == UseIdx and found the wrong SSA value. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166068 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-16 22:51:58 +00:00
Jakob Stoklund Olesen	cdcdfd2cab	Revert r166046 "Switch back to the old coalescer for now to fix the 32 bit bit" A fix for PR14098, including the test case is in the next commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166067 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-16 22:51:55 +00:00
Michael Liao	272ea03239	Teach DAG combine to fold (trunc (fptoXi x)) to (fptoXi x) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166049 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-16 19:38:35 +00:00
Rafael Espindola	6f7cccd2e2	Switch back to the old coalescer for now to fix the 32 bit bit llvm+clang+compiler-rt bootstrap. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166046 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-16 19:34:06 +00:00
Bill Schmidt	7a6cb15a92	This patch addresses PR13949. For the PowerPC 64-bit ELF Linux ABI, aggregates of size less than 8 bytes are to be passed in the low-order bits ("right-adjusted") of the doubleword register or memory slot assigned to them. A previous patch addressed this for aggregates passed in registers. However, small aggregates passed in the overflow portion of the parameter save area are still being passed left-adjusted. The fix is made in PPCTargetLowering::LowerCall_Darwin_Or_64SVR4 on the caller side, and in PPCTargetLowering::LowerFormalArguments_64SVR4 on the callee side. The main fix on the callee side simply extends existing logic for 1- and 2-byte objects to 1- through 7-byte objects, and correcting a constant left over from 32-bit code. There is also a fix to a bogus calculation of the offset to the following argument in the parameter save area. On the caller side, again a constant left over from 32-bit code is fixed. Additionally, some code for 1, 2, and 4-byte objects is duplicated to handle the 3, 5, 6, and 7-byte objects for SVR4 only. The LowerCall_Darwin_Or_64SVR4 logic is getting fairly convoluted trying to handle both ABIs, and I propose to separate this into two functions in a future patch, at which time the duplication can be removed. The patch adds a new test (structsinmem.ll) to demonstrate correct passing of structures of all seven sizes. Eight dummy parameters are used to force these structures to be in the overflow portion of the parameter save area. As a side effect, this corrects the case when aggregates passed in registers are saved into the first eight doublewords of the parameter save area: Previously they were stored left-justified, and now are properly stored right-justified. This requires changing the expected output of existing test case structsinregs.ll. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166022 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-16 13:30:53 +00:00
Stepan Dyatkovskiy	b52ba9f8a8	Issue: Stack is formed improperly for long structures passed as byval arguments for EABI mode. If we took AAPCS reference, we can found the next statements: A: "If the argument requires double-word alignment (8-byte), the NCRN (Next Core Register Number) is rounded up to the next even register number." (5.5 Parameter Passing, Stage C, C.3). B: "The alignment of an aggregate shall be the alignment of its most-aligned component." (4.3 Composite Types, 4.3.1 Aggregates). So if we have structure with doubles (9 double fields) and 3 Core unused registers (r1, r2, r3): caller should use r2 and r3 registers only. Currently r1,r2,r3 set is used, but it is invalid. Callee VA routine should also use r2 and r3 regs only. All is ok here. This behaviour is guessed by rounding up SP address with ADD+BFC operations. Fix: Main fix is in ARMTargetLowering::HandleByVal. If we detected AAPCS mode and 8 byte alignment, we waste odd registers then. P.S.: I also improved LDRB_POST_IMM regression test. Since ldrb instruction will not generated by current regression test after this patch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166018 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-16 07:16:47 +00:00
NAKAMURA Takumi	e26874556b	Reapply r165661, Patch by Shuxin Yang <shuxin.llvm@gmail.com>. Original message: The attached is the fix to radar://11663049. The optimization can be outlined by following rules: (select (x != c), e, c) -> select (x != c), e, x), (select (x == c), c, e) -> select (x == c), x, e) where the <c> is an integer constant. The reason for this change is that : on x86, conditional-move-from-constant needs two instructions; however, conditional-move-from-register need only one instruction. While the LowerSELECT() sounds to be the most convenient place for this optimization, it turns out to be a bad place. The reason is that by replacing the constant <c> with a symbolic value, it obscure some instruction-combining opportunities which would otherwise be very easy to spot. For that reason, I have to postpone the change to last instruction-combining phase. The change passes the test of "make check-all -C <build-root/test" and "make -C project/test-suite/SingleSource". Original message since r165661: My previous change has a bug: I negated the condition code of a CMOV, and go ahead creating a new CMOV using the ORIGINAL condition code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166017 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-16 06:28:34 +00:00
Rafael Espindola	0cead1cfef	Fix the cpu name and add -verify-machineinstrs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166003 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-16 01:13:06 +00:00
Andrew Trick	27c28cef11	misched: Added handleMove support for updating all kill flags, not just for allocatable regs. This is a medium term workaround until we have a more robust solution in the form of a register liveness utility for postRA passes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@166001 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-16 00:22:51 +00:00
Michael Liao	6c0e04c823	Add __builtin_setjmp/_longjmp supprt in X86 backend - Besides used in SjLj exception handling, __builtin_setjmp/__longjmp is also used as a light-weight replacement of setjmp/longjmp which are used to implementation continuation, user-level threading, and etc. The support added in this patch ONLY addresses this usage and is NOT intended to support SjLj exception handling as zero-cost DWARF exception handling is used by default in X86. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165989 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-15 22:39:43 +00:00
Jim Grosbach	64ba635209	ARM: v1i64 and v2i64 VBSL intrinsic support. rdar://12502028 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@165981 91177308-0d34-0410-b5e6-96231b3b80d8	2012-10-15 21:23:40 +00:00

... 6 7 8 9 10 ...

8045 Commits