llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-07-25 14:30:32 +00:00

Author	SHA1	Message	Date
Colin LeMahieu	67451fe320	[Hexagon] Beginning converting intrinsics to patterns instead of duplicated definitions. Converting halfword multiply intrinsics. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226318 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 20:38:54 +00:00
Adam Nemet	ad2ac976af	[AVX512] Add intrinsics for masked aligned FP loads and stores Similar to the unaligned cases. Test was generated with update_llc_test_checks.py. Part of <rdar://problem/17688758> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226296 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 18:50:09 +00:00
Adam Nemet	1310e4f3c7	[AVX512] Remove trailing whitespaces in this test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226295 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 18:50:07 +00:00
Duncan P. N. Exon Smith	fb7514fccb	IR: Allow 16-bits for column info Raise the limit for column information from 8 bits to 16 bits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226291 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 17:33:08 +00:00
Andrea Di Biagio	ac7b9c828f	[X86][DAG] Disable target specific combine on INSERTPS dag nodes at -O0. This patch disables target specific combine on X86ISD::INSERTPS dag nodes if optlevel is CodeGenOpt::None. The backend currently implements a target specific combine rule that converts a vector load used by an INSERTPS dag node into a scalar load plus a scalar_to_vector. This allows ISel to select a single INSERTPSrm instead of two instructions (i.e. a vector load plus INSERTPSrr). However, the existing target combine rule on INSERTPS nodes only works under the assumption that ISel will always be able to match an INSERTPSrm. This is not true in general at -O0, since the backend only allows folding a load into the memory operand of an instruction if the optimization level is not CodeGenOpt::None. In the example below: // __m128 test(__m128 a, __m128 b) { __m128 c = _mm_insert_ps(a, b, 1 << 6); return c; } // Before this patch, at -O0, the backend would have canonicalized the load to 'b' into a scalar load plus scalar_to_vector. Later on, ISel would have selected an INSERTPSrr leaving the insertps mask in an inconsistent state: movss 4(%rdi), %xmm1 insertps $64, %xmm1, %xmm0 # xmm0 = xmm1[1],xmm0[1,2,3]. With this patch, the backend avoids folding the vector load into the operand of the INSERTPS. The new codegen at -O0 is: movaps (%rdi), %xmm1 insertps $64, %xmm1, %xmm0 # %xmm1[1],xmm0[1,2,3]. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226277 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 14:55:26 +00:00
Simon Pilgrim	3717f7c80c	[X86] Refactored stack memory folding tests to explicitly force register spilling The current 'big vectors' stack folded reload testing pattern is very bulky and makes it difficult to test all instructions as big vectors will tend to use only the ymm instruction implementations. This patch changes the tests to use a nop call that lists explicit xmm registers as sideeffects, with this we can force a partial register spill of the relevant registers and then check that the reload is correctly folded. The asm generated only adds the forced spill, a nop instruction and a couple of extra labels (a fraction of the current approach). More exhaustive tests will follow shortly, I've added some extra tests (the xmm versions of some of the existing folding tests) as a starting point. Differential Revision: http://reviews.llvm.org/D6932 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226264 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 09:32:54 +00:00
Timur Iskhodzhanov	6a7c74de33	Revert r226242 - Revert Revert Don't create new comdats in CodeGen This breaks AddressSanitizer (ninja check-asan) on Windows git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226251 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 08:38:45 +00:00
Filipe Cabecinhas	293d3deea3	Use report_fatal_error instead of llvm_unreachable, so we don't crash on user input git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226248 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 04:54:12 +00:00
Hal Finkel	92cd0ca3b2	[PowerPC] Adjust PatchPoints for ppc64le Bill Schmidt pointed out that some adjustments would be needed to properly support powerpc64le (using the ELF V2 ABI). For one thing, R11 is not available as a scratch register, so we need to use R12. R12 is also available under ELF V1, so to maintain consistency, I flipped the order to make R12 the first scratch register in the array under both ABIs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226247 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 04:40:58 +00:00
Mehdi Amini	525f296ef1	Fix Reassociate handling of constant in presence of undef float http://reviews.llvm.org/D6993 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226245 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 03:00:58 +00:00
Rafael Espindola	dfe88a08c7	Revert "Revert Don't create new comdats in CodeGen" This reverts commit r226173, adding r226038 back. No change in this commit, but clang was changed to also produce trivial comdats for constructors, destructors and vtables when needed. Original message: Don't create new comdats in CodeGen. This patch stops the implicit creation of comdats during codegen. Clang now sets the comdat explicitly when it is required. With this patch clang and gcc now produce the same result in pr19848. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226242 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 02:22:55 +00:00
Kevin Enderby	ce68afed3c	Work around to get the build bot clang-cmake-armv7-a15-full green by removing the macho-archive-headers.test added with r226228 that it is failing on for now while I try to figure out what is going on. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226241 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 02:08:11 +00:00
Kevin Enderby	241bbdde37	Another attempt to fix the build bot clang-cmake-armv7-a15-full failing on the macho-archive-headers.test added with r226228. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226239 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 01:09:54 +00:00
Sanjoy Das	148e8c9b8b	Add a new pass "inductive range check elimination" IRCE eliminates range checks of the form 0 <= A * I + B < Length by splitting a loop's iteration space into three segments in a way that the check is completely redundant in the middle segment. As an example, IRCE will convert len = < known positive > for (i = 0; i < n; i++) { if (0 <= i && i < len) { do_something(); } else { throw_out_of_bounds(); } } to len = < known positive > limit = smin(n, len) // no first segment for (i = 0; i < limit; i++) { if (0 <= i && i < len) { // this check is fully redundant do_something(); } else { throw_out_of_bounds(); } } for (i = limit; i < n; i++) { if (0 <= i && i < len) { do_something(); } else { throw_out_of_bounds(); } } IRCE can deal with multiple range checks in the same loop (it takes the intersection of the ranges that will make each of them redundant individually). Currently IRCE does not do any profitability analysis. That is a TODO. Please note that the status of this pass is experimental, and it is not part of any default pass pipeline. Having said that, I will love to get feedback and general input from people interested in trying this out. This pass was originally r226201. It was reverted because it used C++ features not supported by MSVC 2012. Differential Revision: http://reviews.llvm.org/D6693 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226238 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-16 01:03:22 +00:00
Matt Arsenault	ab2315014e	R600/SI: Add patterns for v_cvt_{flr\|rpi}_i32_f32 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226230 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 23:58:35 +00:00
Filipe Cabecinhas	909dd28c54	Fix edge case when Start overflowed in 32 bit mode git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226229 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 23:50:44 +00:00
Kevin Enderby	cdfe54f8a9	Add the option, -archive-headers, used with -macho to print the Mach-O archive headers to llvm-objdump. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226228 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 23:19:11 +00:00
Matt Arsenault	c204f47feb	R600/SI: Fix trailing comma with modifiers Instructions with 1 operand can still use source modifiers, so make sure we don't print an extra comma afterwards. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226226 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 23:17:03 +00:00
Colin LeMahieu	c93be748d7	[Hexagon] Adding new-value store and bit reverse instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226224 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 23:10:29 +00:00
Filipe Cabecinhas	0183477621	Report fatal errors instead of segfaulting/asserting on a few invalid accesses while reading MachO files. Summary: Shift an older “invalid file” test to get a consistent naming for these tests. Bugs found by afl-fuzz Reviewers: rafael Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6945 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226219 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 22:52:38 +00:00
Sanjoy Das	df1b4f601d	Revert r226201 (Add a new pass "inductive range check elimination") The change used C++11 features not supported by MSVC 2012. I will fix the change to use only features supported by MSVC 2012 and recommit shortly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226216 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 22:18:10 +00:00
Hal Finkel	94dc061e85	[PowerPC] Loosen ELFv1 PPC64 func descriptor loads for indirect calls Function pointers under PPC64 ELFv1 (which is used on PPC64/Linux on the POWER7, A2 and earlier cores) are really pointers to a function descriptor, a structure with three pointers: the actual pointer to the code to which to jump, the pointer to the TOC needed by the callee, and an environment pointer. We used to chain these loads, and make them opaque to the rest of the optimizer, so that they'd always occur directly before the call. This is not necessary, and in fact, highly suboptimal on embedded cores. Once the function pointer is known, the loads can be performed ahead of time; in fact, they can be hoisted out of loops. Now these function descriptors are almost always generated by the linker, and thus the contents of the descriptors are invariant. As a result, by default, we'll mark the associated loads as invariant (allowing them to be hoisted out of loops). I've added a target feature to turn this off, however, just in case someone needs that option (constructing an on-stack descriptor, casting it to a function pointer, and then calling it cannot be well-defined C/C++ code, but I can imagine some JIT-compilation system doing so). Consider this simple test: $ cat call.c typedef void (fp)(); void bar(fp x) { for (int i = 0; i < 1600000000; ++i) x(); } $ cat main.c typedef void (fp)(); void bar(fp x); void foo() {} int main() { bar(foo); } On the PPC A2 (the BG/Q supercomputer), marking the function-descriptor loads as invariant brings the execution time down to ~8 seconds from ~32 seconds with the loads in the loop. The difference on the POWER7 is smaller. Compiling with: gcc -std=c99 -O3 -mcpu=native call.c main.c : ~6 seconds [this is 4.8.2] clang -O3 -mcpu=native call.c main.c : ~5.3 seconds clang -O3 -mcpu=native call.c main.c -mno-invariant-function-descriptors : ~4 seconds (looks like we'd benefit from additional loop unrolling here, as a first guess, because this is faster with the extra loads) The -mno-invariant-function-descriptors will be added to Clang shortly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226207 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 21:17:34 +00:00
Colin LeMahieu	02b677594c	[Hexagon] Updating indexed load-extend patterns and changing test to new expected output. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226206 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 21:07:52 +00:00
Sanjoy Das	0170a308ec	Add a new pass "inductive range check elimination" IRCE eliminates range checks of the form 0 <= A * I + B < Length by splitting a loop's iteration space into three segments in a way that the check is completely redundant in the middle segment. As an example, IRCE will convert len = < known positive > for (i = 0; i < n; i++) { if (0 <= i && i < len) { do_something(); } else { throw_out_of_bounds(); } } to len = < known positive > limit = smin(n, len) // no first segment for (i = 0; i < limit; i++) { if (0 <= i && i < len) { // this check is fully redundant do_something(); } else { throw_out_of_bounds(); } } for (i = limit; i < n; i++) { if (0 <= i && i < len) { do_something(); } else { throw_out_of_bounds(); } } IRCE can deal with multiple range checks in the same loop (it takes the intersection of the ranges that will make each of them redundant individually). Currently IRCE does not do any profitability analysis. That is a TODO. Please note that the status of this pass is experimental, and it is not part of any default pass pipeline. Having said that, I will love to get feedback and general input from people interested in trying this out. Differential Revision: http://reviews.llvm.org/D6693 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226201 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 20:45:46 +00:00
Hal Finkel	39b09ae788	Revert "r226086 - Revert "r226071 - [RegisterCoalescer] Remove copies to reserved registers"" Reapply r226071 with fixes. Two fixes: 1. We need to manually remove the old and create the new 'dead defs' associated with physical register definitions when we move the definition of the physical register from the copy point to the point of the original vreg def. This problem was picked up by the machine instruction verifier, and could trigger a verification failure on test/CodeGen/X86/2009-02-12-DebugInfoVLA.ll, so I've turned on the verifier in the tests. 2. When moving the def point of the phys reg up, we need to make sure that it is neither defined nor read in between the two instructions. We don't, however, extend the live ranges of phys reg defs to cover uses, so just checking for live-range overlap between the pair interval and the phys reg aliases won't pick up reads. As a result, we manually iterate over the range and check for reads. A test soon to be committed to the PowerPC backend will test this change. Original commit message: [RegisterCoalescer] Remove copies to reserved registers This allows the RegisterCoalescer to join "non-flipped" range pairs with a physical destination register -- which allows the RegisterCoalescer to remove copies like this: <vreg> = something (maybe a load, for example) ... (things that don't use PHYSREG) PHYSREG = COPY <vreg> (with all of the restrictions normally applied by the RegisterCoalescer: having compatible register classes, etc. ) Previously, the RegisterCoalescer handled only the opposite case (copying from a physical register). I don't handle the problem fully here, but try to get the common case where there is only one use of <vreg> (the COPY). An upcoming commit to the PowerPC backend will make this pattern much more common on PPC64/ELF systems. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226200 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 20:32:09 +00:00
Matt Arsenault	ecbec418bd	R600/SI: Improve fpext / fptrunc test coverage git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226197 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 19:39:42 +00:00
Colin LeMahieu	42fa763380	[Hexagon] Removing old versions of vsplice, valign, cl0, ct0 and updating references to new versions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226194 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 19:28:32 +00:00
Marek Olsak	232d5fa02c	R600/SI: Use 64-bit encoding by default for opcodes that are VOP3-only on VI git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226190 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 18:43:01 +00:00
Colin LeMahieu	500b0d97a1	[Hexagon] Adding vmux instruction. Removing old transfer instructions and updating references. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226184 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 18:16:00 +00:00
Ramkumar Ramachandra	4f158a708b	statepoint tests: use statepoint-example gc Mechanical conversion of statepoint tests to use the example-statepoint gc. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226183 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 18:10:44 +00:00
Joerg Sonnenberger	638077aa41	Support @PLT loads on 32bit x86. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226182 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 17:59:02 +00:00
Colin LeMahieu	044438aff5	[Hexagon] Deleting old float comparison instruction and updating references to new ones. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226179 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 17:28:14 +00:00
Colin LeMahieu	4ce3b1e4ce	[Hexagon] Replacing old fadd/fsub instructions and updating references. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226176 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 16:30:07 +00:00
Timur Iskhodzhanov	d048b3be70	Revert Don't create new comdats in CodeGen It breaks AddressSanitizer on Windows. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226173 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 16:14:34 +00:00
Daniel Sanders	cb71ef1b46	[mips] Fix a typo in the compare patterns for MIPS32r6/MIPS64r6. Summary: The patterns intended for the SETLE node were actually matching the SETLT node. Reviewers: atanasyan, sstankovic, vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6997 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226171 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 15:41:03 +00:00
Vladimir Medic	b6d562e480	Add disassembler tests for mips64r6 platform. There are no functional changes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226166 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 14:18:12 +00:00
Vladimir Medic	d83822e6d7	Add disassembler tests for mips32r6 platform. There are no functional changes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226165 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 14:11:38 +00:00
Vladimir Medic	e671c1cdb5	Add disassembler tests for mips64r2 platform. There are no functional changes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226164 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 14:06:34 +00:00
Chandler Carruth	e2ffd02ad3	[PM] Port TargetLibraryInfo to the new pass manager, provided by the TargetLibraryAnalysis pass. There are actually no direct tests of this already in the tree. I've added the most basic test that the pass manager bits themselves work, and the TLI object produced will be tested by upcoming patches as they port passes which rely on TLI. This is starting to point out the awkwardness of the invalidate API -- it seems poorly fitting on the result object. I suspect I will change it to live on the analysis instead, but that's not for this change, and I'd rather have a few more passes ported in order to have more experience with how this plays out. I believe there is only one more analysis required in order to start porting instcombine. =] git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226160 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 11:39:46 +00:00
Vladimir Medic	dc67d7678a	Add disassembler tests for mips64 platform. There are no functional changes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226151 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 08:50:20 +00:00
Hal Finkel	f908e37144	Revert "r226071 - [RegisterCoalescer] Remove copies to reserved registers" Reverting this while I investigate some bad behavior this is causing. As a possibly-related issue, adding -verify-machineinstrs to one of the test cases now fails because of this change: llc test/CodeGen/X86/2009-02-12-DebugInfoVLA.ll -march=x86-64 -o - -verify-machineinstrs * Bad machine code: No instruction at def index * - function: foo - basic block: BB#0 return (0x10007e21f10) [0B;736B) - liverange: [128r,128d:9)[160r,160d:8)[176r,176d:7)[336r,336d:6)[464r,464d:5)[480r,480d:4)[624r,624d:3)[752r,752d:2)[768r,768d:1)[784r,784d:0) 0@784r 1@768r 2@752r 3@624r 4@480r 5@464r 6@336r 7@176r 8@160r 9@128r - register: %DS Valno #3 is defined at 624r * Bad machine code: Live segment doesn't end at a valid instruction * - function: foo - basic block: BB#0 return (0x10007e21f10) [0B;736B) - liverange: [128r,128d:9)[160r,160d:8)[176r,176d:7)[336r,336d:6)[464r,464d:5)[480r,480d:4)[624r,624d:3)[752r,752d:2)[768r,768d:1)[784r,784d:0) 0@784r 1@768r 2@752r 3@624r 4@480r 5@464r 6@336r 7@176r 8@160r 9@128r - register: %DS [624r,624d:3) LLVM ERROR: Found 2 machine code errors. where 624r corresponds exactly to the interval combining change: 624B %RSP<def> = COPY %vreg16; GR64:%vreg16 Considering merging %vreg16 with %RSP RHS = %vreg16 [608r,624r:0) 0@608r updated: 608B %RSP<def> = MOV64rm <fi#3>, 1, %noreg, 0, %noreg; mem:LD8[%saved_stack.1] Success: %vreg16 -> %RSP Result = %RSP git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226086 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 03:08:59 +00:00
Sanjoy Das	7ec1829823	Fix PR22222 The bug was introduced in r225282. r225282 assumed that sub X, Y is the same as add X, -Y. This is not correct if we are going to upgrade the sub to sub nuw. This change fixes the issue by making the optimization ignore sub instructions. Differential Revision: http://reviews.llvm.org/D6979 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226075 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 01:46:09 +00:00
Hal Finkel	47ab8c106f	[RegisterCoalescer] Remove copies to reserved registers This allows the RegisterCoalescer to join "non-flipped" range pairs with a physical destination register -- which allows the RegisterCoalescer to remove copies like this: <vreg> = something (maybe a load, for example) ... (things that don't use PHYSREG) PHYSREG = COPY <vreg> (with all of the restrictions normally applied by the RegisterCoalescer: having compatible register classes, etc. ) Previously, the RegisterCoalescer handled only the opposite case (copying from a physical register). I don't handle the problem fully here, but try to get the common case where there is only one use of <vreg> (the COPY). An upcoming commit to the PowerPC backend will make this pattern much more common on PPC64/ELF systems. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226071 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 01:25:28 +00:00
Hal Finkel	4572a0a0a2	[PowerPC] Add assembler support for mcrfs and friends Fill out our support for the floating-point status and control register instructions (mcrfs and friends). As it turns out, these are necessary for compiling src/test/harness_fp.h in TBB for PowerPC. Thanks to Raf Schietekat for reporting the issue! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226070 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 01:00:53 +00:00
Richard Smith	ef7d38d35a	For PR21145: recognise a builtin call to a known deallocation function even if it's defined in the current module. Clang generates this situation for the C++14 sized deallocation functions, because it generates a weak definition in case one isn't provided by the C++ runtime library. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226069 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-15 01:00:33 +00:00
Ramkumar Ramachandra	fba4d82671	[GC] CodeGenPrep transform: simplify offsetable relocate The transform is somewhat involved, but the basic idea is simple: find derived pointers that have been offset from the base pointer using gep and replace the relocate of the derived pointer with a gep to the relocated base pointer (with the same offset). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226060 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 23:27:07 +00:00
Philip Reames	8f9d11309a	getMangledTypeStr: clarify how it mangles types, and add tests "Write a set of tests that show how name mangling is done for overloaded intrinsics." These happen to use gc.relocates to exercise the codepath in question, but is not a GC specific test. Patch by: artagnon@gmail.com Differential Revision: http://reviews.llvm.org/D6915 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226056 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 23:05:17 +00:00
Duncan P. N. Exon Smith	37ac8d3622	IR: Move MDLocation into place This commit moves `MDLocation`, finishing off PR21433. There's an accompanying clang commit for frontend testcases. I'll attach the testcase upgrade script I used to PR21433 to help out-of-tree frontends/backends. This changes the schema for `DebugLoc` and `DILocation` from: !{i32 3, i32 7, !7, !8} to: !MDLocation(line: 3, column: 7, scope: !7, inlinedAt: !8) Note that empty fields (line/column: 0 and inlinedAt: null) don't get printed by the assembly writer. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226048 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 22:27:36 +00:00
Duncan P. N. Exon Smith	bf13dd03e7	IR: Always print MDLocation line Print `MDLocation`'s `line` field even when it's 0. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226046 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 22:14:26 +00:00
Rafael Espindola	33f5127540	Don't create new comdats in CodeGen. This patch stops the implicit creation of comdats during codegen. Clang now sets the comdat explicitly when it is required. With this patch clang and gcc now produce the same result in pr19848. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226038 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 20:55:48 +00:00
Rafael Espindola	a6b0d5a62e	Add a test that would have found the issue with r225644. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226035 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 20:24:46 +00:00
Chandler Carruth	8af8091ef5	[MBP] Add flags to disable the BadCFGConflict check in MachineBlockPlacement. Some benchmarks have shown that this could lead to a potential performance benefit, and so adding some flags to try to help measure the difference. A possible explanation: in diamond-shaped CFGs (A followed by either B or C, both followed by D), putting both B and C between A and D makes the code less dense than it could be. Either B or C always has to be skipped, increasing the chance of cache misses, etc. Moving either B or C to after D might be beneficial on average. In the long run, we should probably do a better job of analyzing the basic block and branch probabilities to move the correct one of B or C to after D. But even if we don't use this in the long run, it is a good baseline for benchmarking. Original patch authored by Daniel Jasper with test tweaks and a second flag added by me. Differential Revision: http://reviews.llvm.org/D6969 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226034 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 20:19:29 +00:00
Bill Schmidt	11abe69e98	[PPC64] Add support for the ICBT instruction on POWER8. Patch by Kit Barton. Support for the ICBT instruction is currently present, but limited to embedded processors. This change adds a new FeatureICBT that can be used to identify whether the ICBT instruction is available on a specific processor. Two new tests are added: * Positive test to ensure the icbt instruction is present when using -mcpu=pwr8 * Negative test to ensure the icbt instruction is not generated when using -mcpu=pwr7 Both test cases use the Prefetch opcode in LLVM. They are based on the ppc64-prefetch.ll test case. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226033 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 20:17:10 +00:00
Rafael Espindola	0a2caa143f	Fix linking of shared libraries. In shared libraries the plugin can see non-weak declarations that are still undefined. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226031 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 20:08:46 +00:00
Rafael Espindola	55c86a8cdc	Fix handling of extern_weak. This was broken by r225983. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226026 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 19:43:32 +00:00
David Majnemer	5e8cd99f55	InstCombine: Don't take A-B<0 into A<B if A-B has other uses This fixes PR22226. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226023 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 19:26:56 +00:00
Rafael Espindola	8327f0bca1	Revert "Add r224985 back with two fixes." This reverts commit r225644 while I debug a regression. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226022 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 19:07:23 +00:00
Rafael Espindola	ad946a868f	Add support for comdats with names larger than 256 characters. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@226012 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 18:25:45 +00:00
Olivier Sallenave	735aa71398	Check that the TLI callback enableAggressiveFMAFusion has the desired effect on FMA folding. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225987 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 15:36:28 +00:00
Rafael Espindola	5f92811f30	Handle a symbol being undefined. This can happen if: * It is present in a comdat in one file. * It is not present in the comdat of the file that is kept. * It is not used. This should fix the LTO bootstrap. Thanks to Takumi NAKAMURA for setting up the bot! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225983 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 13:53:50 +00:00
Vladimir Medic	81e68c9023	Add disassembler tests for mips32r2 platform. There are no functional changes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225980 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 11:35:22 +00:00
Jyoti Allur	fd06dd8efc	Correct POP handling for v7m git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225972 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 10:48:16 +00:00
Chandler Carruth	8c3a02f8fe	[PM] Port domtree to the new pass manager (at last). This adds the domtree analysis to the new pass manager. The analysis returns the same DominatorTree result entity used by the old pass manager and essentially all of the code is shared. We just have different boilerplate for running and printing the analysis. I've converted one test to run in both modes just to make sure this is exercised while both are live in the tree. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225969 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 10:19:28 +00:00
Kai Nacke	92e28620d3	[mips] Refine octeon instructions seq/seqi/sne/snei This commit refines the pattern for the octeon seq/seqi/sne/snei instructions. The target register is set to 0 or 1 according to the result of the comparison. In C, this is something like rd = (unsigned long)(rs == rt) This commit adds a zext to bring the result to i64. With this change the instruction is selected for this type of code. (gcc produces the same code for the above C code.) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225968 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 10:19:09 +00:00
Vladimir Medic	8da9819ca7	Add disassembler tests for mips32r2 platform. There are no functional changes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225967 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 10:18:56 +00:00
Brad Smith	f449c53c89	Use the integrated assembler by default on SPARC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225957 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 07:53:39 +00:00
JF Bastien	7f0cbb5703	Revert "Insert random noops to increase security against ROP attacks (llvm)" This reverts commit: http://reviews.llvm.org/D3392 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225948 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 05:24:33 +00:00
Saleem Abdulrasool	1679d0d3c2	X86: validate 'int' instruction The int instruction takes as an operand an 8-bit immediate value. Validate that the input is valid rather than silently truncating the value. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225941 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 05:10:21 +00:00
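The difference between silently truncating and validating an 8-bit immediate, as described in the commit above, can be sketched as follows. This is a minimal illustration, not the LLVM assembler code: the function names are hypothetical, and treating the operand as an unsigned value in the range 0 to 255 is an assumption.

```python
def truncate_imm8(value: int) -> int:
    """Old behaviour (illustrative): keep only the low 8 bits, hiding the error."""
    return value & 0xFF


def validate_imm8(value: int) -> int:
    """New behaviour (illustrative): reject anything that does not fit in 8 bits.

    Assumption: the operand is treated as unsigned, so the valid range is 0..255.
    """
    if not 0 <= value <= 0xFF:
        raise ValueError(f"immediate {value:#x} does not fit in 8 bits")
    return value
```

With truncation, `int $0x180` would quietly assemble as `int $0x80`; with validation, the out-of-range operand is reported instead.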
NAKAMURA Takumi	2a38522280	Disable a couple of tests, CodeGen/X86/noop-insert.ll and CodeGen/X86/noop-insert-percentage.ll, in r225908, to unbreak tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225940 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 04:21:33 +00:00
Chandler Carruth	d39ad60564	[dom] Add a basic dominator tree test. Correct, we have zero basic testing of the dominator tree in the regression test suite. There is a single test that even prints it out, and that test only checks a single line of the output. There are a handful of tests that check post dominators, but all of those are looking for bugs rather than just exercising the basic machinery. This test is super boring and unexciting. But hey, it's something. I needed there to be something so I could switch the basic test to run with both the old and new pass manager. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225936 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 03:34:55 +00:00
Tim Northover	09bec94c16	ARM: add test for crc32 instructions in CodeGen. Somehow we seem to have ended up without any actual tests of the CodeGen side. Easy enough to fix. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225930 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 01:43:33 +00:00
Hal Finkel	037c21f82c	[PowerPC] Fix the noop-insert test The form of nops used is CPU-specific (some CPUs, such as the POWER7, have special group-terminating nops). We probably want a different callback for this kind of nop insertion (something more like MCAsmBackend::writeNopData), or for PPC to use a different mechanism for scheduling nops, but this will stop the test from failing for now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225928 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 01:37:21 +00:00
Matt Arsenault	140c2ece1e	R600/SI: Remove some redundant load testcases. This reduces coverage for Evergreen, since the more complete tests have those run lines disabled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225927 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 01:35:26 +00:00
Matt Arsenault	781f7ee502	R600/SI: Fix bad code with unaligned byte vector loads Don't do the v4i8 -> v4f32 combine if the load will need to be expanded due to alignment. This stops adding instructions to repack into a single register that the v_cvt_ubyteN_f32 instructions read. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225926 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 01:35:22 +00:00
Matt Arsenault	8b6a26ca85	Implement new way of expanding extloads. Now that the source and destination types can be specified, allow doing an expansion that doesn't use an EXTLOAD of the result type. Try to do a legal extload to an intermediate type and extend that if possible. This generalizes the special case custom lowering of extloads R600 has been using to work around this problem. This also happens to fix a bug that would incorrectly use more aligned loads than should be used. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225925 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 01:35:17 +00:00
Duncan P. N. Exon Smith	68ee48f92e	Utils: Handle remapping distinct MDLocations Part of PR21433. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225921 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 01:29:32 +00:00
Duncan P. N. Exon Smith	74195b2df3	Utils: Add mapping for uniqued MDLocations Still doesn't handle distinct ones. Part of PR21433. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225914 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 01:20:27 +00:00
Hal Finkel	ade705c6e5	Revert "r225811 - Revert "r225808 - [PowerPC] Add StackMap/PatchPoint support"" This re-applies r225808, fixed to avoid problems with SDAG dependencies along with the preceding fix to ScheduleDAGSDNodes::RegDefIter::InitNodeNumDefs. These problems caused the original regression tests to assert/segfault on many (but not all) systems. Original commit message: This commit does two things: 1. Refactors PPCFastISel to use more of the common infrastructure for call lowering (this lets us take advantage of this common code for lowering some common intrinsics, stackmap/patchpoint among them). 2. Adds support for stackmap/patchpoint lowering. For the most part, this is very similar to the support in the AArch64 target, with the obvious differences (different registers, NOP instructions, etc.). The test cases are adapted from the AArch64 test cases. One difference of note is that the patchpoint call sequence takes 24 bytes, so you can't use less than that (on AArch64 you can go down to 16). Also, as noted in the docs, we take the patchpoint address to be the actual code address (assuming the call is local in the TOC-sharing sense), which should yield higher performance than generating the full cross-DSO indirect-call sequence and is likely just as useful for JITed code (if not, we'll change it). StackMaps and Patchpoints are still marked as experimental, and so this support is doubly experimental. So go ahead and experiment! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225909 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 01:07:51 +00:00
JF Bastien	21befa7761	Insert random noops to increase security against ROP attacks (llvm) A pass that adds random noops to X86 binaries to introduce diversity with the goal of increasing security against most return-oriented programming attacks. Command line options: -noop-insertion // Enable noop insertion. -noop-insertion-percentage=X // X% of assembly instructions will have a noop prepended (default: 50%, requires -noop-insertion) -max-noops-per-instruction=X // Randomly generate X noops per instruction. ie. roll the dice X times with probability set above (default: 1). This doesn't guarantee X noop instructions. In addition, the following 'quick switch' in clang enables basic diversity using default settings (currently: noop insertion and schedule randomization; it is intended to be extended in the future). -fdiversify This is the llvm part of the patch. clang part: D3393 http://reviews.llvm.org/D3392 Patch by Stephen Crane (@rinon) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225908 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 01:07:26 +00:00
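The option semantics described in the commit above (roll the dice up to X times per instruction, each roll succeeding with the configured percentage) can be modelled with a short sketch. This is only an illustration of the described behaviour, not the pass itself: the function names, the use of Python's `random` module, and the string representation of instructions are all assumptions.

```python
import random


def count_noops(rng: random.Random, percentage: float, max_noops: int) -> int:
    """Roll the dice max_noops times; each roll adds a noop with the given probability."""
    return sum(1 for _ in range(max_noops) if rng.random() * 100 < percentage)


def insert_noops(instructions, percentage=50, max_noops=1, seed=0):
    """Return a new instruction list with randomly chosen noops prepended.

    percentage and max_noops mirror -noop-insertion-percentage and
    -max-noops-per-instruction from the commit message; seed is a
    hypothetical parameter added here to make the sketch reproducible.
    """
    rng = random.Random(seed)
    result = []
    for inst in instructions:
        result.extend(["nop"] * count_noops(rng, percentage, max_noops))
        result.append(inst)
    return result
```

Because every roll is independent, `max_noops=X` is an upper bound per instruction rather than a guarantee, which matches the note in the commit message that X noops are not guaranteed.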
Reid Kleckner	504fa89c8e	CodeGen support for x86_64 SEH catch handlers in LLVM This adds handling for ExceptionHandling::MSVC, used by the x86_64-pc-windows-msvc triple. It assumes that filter functions have already been outlined in either the frontend or the backend. Filter functions are used in place of the landingpad catch clause type info operands. In catch clause order, the first filter to return true will catch the exception. The C specific handler table expects the landing pad to be split into one block per handler, but LLVM IR uses a single landing pad for all possible unwind actions. This patch papers over the mismatch by synthesizing single instruction BBs for every catch clause to fill in the EH selector that the landing pad block expects. Missing functionality: - Accessing data in the parent frame from outlined filters - Cleanups (from __finally) are unsupported, as they will require outlining and parent frame access - Filter clauses are unsupported, as there's no clear analogue in SEH In other words, this is the minimal set of changes needed to write IR to catch arbitrary exceptions and resume normal execution. Reviewers: majnemer Differential Revision: http://reviews.llvm.org/D6300 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225904 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 01:05:27 +00:00
Ahmed Bougacha	61d6dc41fa	[SimplifyLibCalls] Don't try to simplify indirect calls. It turns out, all callsites of the simplifier are guarded by a check for CallInst::getCalledFunction (i.e., to make sure the callee is direct). This check wasn't done when trying to further optimize a simplified fortified libcall, introduced by a refactoring in r225640. Fix that, add a testcase, and document the requirement. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225895 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-14 00:55:05 +00:00
Adrian Prantl	57ed5ffc76	Debug Info: Move the complex expression handling (=the remainder) of emitDebugLocValue() into DwarfExpression. Ought to be NFC, but it actually uncovered a bug in the debug-loc-asan.ll testcase. The testcase checks that the address of variable "y" is stored at [RSP+16], which also lines up with the comment. It also check(ed) that the value of "y" is stored in RDI before that, but that is actually incorrect, since RDI is the very value that is stored in [RSP+16]. Here's the assembler output: movb 2147450880(%rcx), %r8b #DEBUG_VALUE: bar:y <- RDI cmpb $0, %r8b movq %rax, 32(%rsp) # 8-byte Spill movq %rsi, 24(%rsp) # 8-byte Spill movq %rdi, 16(%rsp) # 8-byte Spill .Ltmp3: #DEBUG_VALUE: bar:y <- [RSP+16] Fixed the comment to spell out the correct register and the check to expect an address rather than a value. Note that the range that is emitted for the RDI location was and is still wrong, it claims to begin at the function prologue, but really it should start where RDI is first assigned. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225851 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 23:39:11 +00:00
Adam Nemet	656da67bc0	[AVX512] Add 16x32 unpck tests as well Forgot this from r225838. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225850 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 23:27:55 +00:00
Chandler Carruth	a28a251950	[PM] Remove the defunct CGSCC-specific debug flag. Even before I sunk the debug flag into the opt tool this had been made obsolete by factoring the pass and analysis managers into a single set of templates that all used the core flag. No functionality changed here. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225842 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 22:45:13 +00:00
Adam Nemet	f38f71d8a0	Fix function names in tests from r225838. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225840 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 22:40:15 +00:00
Adam Nemet	293f71ddd2	[AVX512] Unpack support in new shuffle lowering This now handles both 32 and 64-bit element sizes. In this version, the tests are in vector-shuffle-512-v8.ll, canonicalized by Chandler's update_llc_test_checks.py. Part of <rdar://problem/17688758> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225838 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 22:20:18 +00:00
Duncan P. N. Exon Smith	3b0fe4ec0a	AsmParser/Bitcode: Add support for MDLocation This adds assembly and bitcode support for `MDLocation`. The assembly side is rather big, since this is the first `MDNode` subclass (that isn't `MDTuple`). Part of PR21433. (If you're wondering where the mountains of testcase updates are, we don't need them until I update `DILocation` and `DebugLoc` to actually use this class.) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225830 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 21:10:44 +00:00
Matt Arsenault	8603a3d1c5	R600: Implement getRsqrtEstimate Only do for f32 since I'm unclear on both what this is expecting for the refinement steps in terms of accuracy, and what f64 instruction actually provides. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225827 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 20:53:18 +00:00
Matt Arsenault	9e495c518c	R600: Make cttz / ctlz cheap to speculate Speculating things is generally good. SI+ has instructions for these for 32-bit values. This is still probably better even with the expansion for 64-bit values, although it is odd that this callback doesn't have the size as a parameter. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225822 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 19:46:48 +00:00
Ulrich Weigand	81d2500685	Use the integrated assembler as default on SystemZ This was already done in clang, this commit now uses the integrated assembler as default when using LLVM tools directly. A number of test cases deliberately using an invalid instruction in inline asm now have to use -no-integrated-as. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225820 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 19:45:16 +00:00
Ulrich Weigand	5a4c26e7bc	Use the integrated assembler as default on PowerPC This was already done in clang, this commit now uses the integrated assembler as default when using LLVM tools directly. A number of test cases using inline asm had to be adapted, either by updating the expected output, or by using -no-integrated-as (for such tests that deliberately use an invalid instruction in inline asm). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225819 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 19:43:45 +00:00
Hal Finkel	ea55eceaed	Revert "r225808 - [PowerPC] Add StackMap/PatchPoint support" Reverting this while I investigate buildbot failures (segfaulting in GetCostForDef at ScheduleDAGRRList.cpp:314). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225811 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 18:25:05 +00:00
Will Schmidt	3d6977a65a	Update multiline.ll testcase to handle (ppc64le) .localentry directive The ppc64le platform will emit a .localentry directive. This is triggering a false-positive against a CHECK-NOT: .loc in multiline.ll. Add a space "{{ }}" to the check-not line to allow for arguments, and prevent .localentry from matching. Differential Revision: http://reviews.llvm.org/D6935 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225810 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 18:17:08 +00:00
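The false positive described in the commit above comes from substring matching: a `CHECK-NOT: .loc` pattern also matches the first four characters of `.localentry`. The sketch below models that with plain Python string search. It is not FileCheck itself, and the sample assembly lines are invented for illustration.

```python
def check_not_matches(pattern: str, output_lines) -> bool:
    """Return True if the forbidden pattern occurs anywhere in the output.

    Models a CHECK-NOT line as a plain substring search (an assumption that
    holds for fixed-string patterns such as the ones used here).
    """
    return any(pattern in line for line in output_lines)


# Hypothetical assembler output on ppc64le: no .loc directive, but a .localentry one.
ppc64le_output = ["\t.localentry\tfoo, .Ltmp1-.Ltmp0"]
# Hypothetical output that really does contain a .loc directive.
output_with_loc = ["\t.loc 1 5 0"]

old_pattern = ".loc"   # matches the prefix of ".localentry" as well
new_pattern = ".loc "  # corresponds to ".loc{{ }}": the trailing space is required
```

Requiring the space after `.loc` still catches a real `.loc` directive with arguments, while `.localentry` no longer triggers the check.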
Hal Finkel	232f393466	[PowerPC] Add StackMap/PatchPoint support This commit does two things: 1. Refactors PPCFastISel to use more of the common infrastructure for call lowering (this lets us take advantage of this common code for lowering some common intrinsics, stackmap/patchpoint among them). 2. Adds support for stackmap/patchpoint lowering. For the most part, this is very similar to the support in the AArch64 target, with the obvious differences (different registers, NOP instructions, etc.). The test cases are adapted from the AArch64 test cases. One difference of note is that the patchpoint call sequence takes 24 bytes, so you can't use less than that (on AArch64 you can go down to 16). Also, as noted in the docs, we take the patchpoint address to be the actual code address (assuming the call is local in the TOC-sharing sense), which should yield higher performance than generating the full cross-DSO indirect-call sequence and is likely just as useful for JITed code (if not, we'll change it). StackMaps and Patchpoints are still marked as experimental, and so this support is doubly experimental. So go ahead and experiment! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225808 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 17:48:12 +00:00
Jozef Kolek	abdc0284ff	[mips][microMIPS] Fix issue with 16b instructions in jr instruction delay slot 16 bit instructions are not allowed in jr delay slot. Same stands for PseudoIndirectBranch and PseudoReturn. Differential Revision: http://reviews.llvm.org/D6815 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225798 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 15:59:17 +00:00
Chandler Carruth	92602ea18e	[PM] Refactor the new pass manager to use a single template to implement the generic functionality of the pass managers themselves. In the new infrastructure, the pass "manager" isn't actually interesting at all. It just pipelines a single chunk of IR through N passes. We don't need to know anything about the IR or the passes to do this really and we can replace the 3 implementations of the exact same functionality with a single generic PassManager template, complementing the single generic AnalysisManager template. I've left typedefs in place to give convenient names to the various obvious instantiations of the template. With this, I think I've nuked almost all of the redundant logic in the managers, and I think the overall design is actually simpler for having single templates that clearly indicate there is no special logic here. The logging is made somewhat more annoying by this change, but I don't think the difference is worth having heavy-weight traits to help log things. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225783 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 11:13:56 +00:00
Chandler Carruth	6b1894aeae	[PM] Fold all three analysis managers into a single AnalysisManager template. This consolidates three copies of nearly the same core logic. It adds "complexity" to the ModuleAnalysisManager in that it makes it possible to share a ModuleAnalysisManager across multiple modules... But it does so by deleting all of the code, so I'm OK with that. This will naturally make fixing bugs in this code much simpler, etc. The only down side here is that we have to use 'typename' and 'this->' in various places, and the implementation is lifted into the header. I'll take that for the code size reduction. The convenient names are still typedef-ed and used throughout so that users can largely ignore this aspect of the implementation. The follow-up change to this will do the exact same refactoring for the PassManagers. =D It turns out that the interesting different code is almost entirely in the adaptors. At the end, that should be essentially all that is left. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225757 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 02:51:47 +00:00
Reid Kleckner	d8f69a7201	Rename llvm.recoverframeallocation to llvm.framerecover This name is less descriptive, but it sort of puts things in the 'llvm.frame...' namespace, relating it to frameallocate and frameaddress. It also avoids using "allocate" and "allocation" together. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225752 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 01:51:34 +00:00
Reid Kleckner	221a7075cf	Add the llvm.frameallocate and llvm.recoverframeallocation intrinsics These intrinsics allow multiple functions to share a single stack allocation from one function's call frame. The function with the allocation may only perform one allocation, and it must be in the entry block. Functions accessing the allocation call llvm.recoverframeallocation with the function whose frame they are accessing and a frame pointer from an active call frame of that function. These intrinsics are very difficult to inline correctly, so the intention is that they be introduced rarely, or at least very late during EH preparation. Reviewers: echristo, andrew.w.kaylor Differential Revision: http://reviews.llvm.org/D6493 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225746 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 00:48:10 +00:00
Matt Arsenault	29ad7506e1	Combine fcmp + select to fminnum / fmaxnum if no nans and legal Also require unsafe FP math for now since there isn't a way to test for signed zeros. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225744 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-13 00:43:00 +00:00
Reid Kleckner	1ec250a32f	musttail: Only set the inreg flag for fastcall and vectorcall Otherwise we'll attempt to forward ECX, EDX, and EAX for cdecl and stdcall thunks, leaving us with no scratch registers for indirect call targets. Fixes PR22052. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225729 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-12 23:28:23 +00:00
Adrian Prantl	f89325d832	Debug info: Factor out the creation of DWARF expressions from AsmPrinter into a new class DwarfExpression that can be shared between AsmPrinter and DwarfUnit. This is the first step towards unifying the two entirely redundant implementations of dwarf expression emission in DwarfUnit and AsmPrinter. Almost no functional change — Testcases were updated because asm comments that used to be on two lines now appear on the same line, which is actually preferable. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225706 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-12 22:19:22 +00:00
Ahmed Bougacha	cd5bbd8bad	[X86] Also create+widen FMIN/FMAX nodes for v2f32. This happens in the HINT benchmark, where the SLP-vectorizer created v2f32 fcmp/select code. The "correct" solution would have been to teach the vectorizer cost model that v2f32 isn't legal (because really, it isn't), but if we can vectorize we might as well do so. We legalize these v2f32 FMIN/FMAX nodes by widening to v4f32 later on. v3f32 were already widened to v4f32 by the generic unroll-and-build-vector legalization. rdar://15763436 Differential Revision: http://reviews.llvm.org/D6557 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225691 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-12 20:31:30 +00:00
Ahmed Bougacha	5316023a4e	[X86] Make SSE min/max testcases more explicit. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225687 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-12 20:15:47 +00:00
Tom Stellard	d275e025d2	R600/SI: Use RegisterOperands to specify which operands can accept immediates There are some operands which can take either immediates or registers and we were previously using different register class to distinguish between operands that could take immediates and those that could not. This patch switches to using RegisterOperands which should simplify the backend by reducing the number of register classes and also make it easier to implement the assembler. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225662 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-12 19:33:18 +00:00
Sanjay Patel	2211d38267	GVN: propagate equalities for floating point compares Allow optimizations based on FP comparison values in the same way as integers. This resolves PR17713: http://llvm.org/bugs/show_bug.cgi?id=17713 Differential Revision: http://reviews.llvm.org/D6911 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225660 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-12 19:29:48 +00:00
Rafael Espindola	5512415ade	Add r224985 back with two fixes. One is that AArch64 has additional restrictions on when local relocations can be used. We have to take those into consideration when deciding to put a L symbol in the symbol table or not. The other is that ld64 requires the relocations to cstring to use linker visible symbols on AArch64. Thanks to Michael Zolotukhin for testing this! Remove doesSectionRequireSymbols. In an assembly expression like bar: .long L0 + 1 the intended semantics is that bar will contain a pointer one byte past L0. In sections that are merged by content (strings, 4 byte constants, etc), a single position in the section doesn't give the linker enough information. For example, it would not be able to tell a relocation must point to the end of a string, since that would look just like the start of the next. The solution used in ELF is to use relocations with symbols if there is a non-zero addend. In MachO before this patch we would just keep all symbols in some sections. This would miss some cases (only cstrings on x86_64 were implemented) and was inefficient since most relocations have an addend of 0 and can be represented without the symbol. This patch implements the non-zero addend logic for MachO too. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225644 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-12 18:13:07 +00:00
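The non-zero addend rule from the commit above can be reduced to a small decision function. This is a deliberately simplified sketch: the function name and return values are hypothetical, and it ignores the AArch64 and ld64 restrictions that the commit message also mentions.

```python
def relocation_base(section_merged_by_content: bool, addend: int) -> str:
    """Decide what a relocation against a local label should refer to.

    Returns "symbol" when the linker needs the symbol to interpret the
    addend, and "section" when a plain section offset is enough.
    Simplified model; target-specific restrictions are not represented.
    """
    if section_merged_by_content and addend != 0:
        # e.g. ".long L0 + 1" into a string section: an offset alone could be
        # read as either the end of one string or the start of the next.
        return "symbol"
    return "section"
```

Most relocations have an addend of 0, so under this rule most of them can drop the symbol, which is the efficiency gain the commit describes.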
Jozef Kolek	ad017096fc	[mips][microMIPS] Implement BEQZ16 and BNEZ16 instructions Differential Revision: http://reviews.llvm.org/D5271 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225627 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-12 12:03:34 +00:00
Richard Smith	7a95c03b1d	Put this test's input in the Inputs directory where it belongs, rather than reusing a file from a different test directory. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225621 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-12 08:50:47 +00:00
Hal Finkel	b6bb7db62b	[PowerPC] Fix calls to non-function objects Looking at r225438 inspired me to see how the PowerPC backend handled the situation (calling a bitcasted TLS global), and it turns out we also produced an error (cannot select ...). What it means to "call" something that is not a function is implementation and platform specific, but in the name of doing something (besides crashing), this makes sure we do what GCC does (treat all such calls as calls through a function pointer -- meaning that the pointer is assumed, as is the convention on PPC, to point to a function descriptor structure holding the actual code address along with the function's TOC pointer and environment pointer). As GCC does, we now do the same for calling regular (non-TLS) non-function globals too. I'm not sure whether this is the most useful way to define the behavior, but at least we won't be alone. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225617 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-12 04:34:47 +00:00
David Majnemer	85a0cb9bf2	Revert most of r225597 We can't rely on a DataLayout enlightened constant folder. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225599 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-11 07:29:51 +00:00
David Majnemer	d2f4460ee7	X86: Properly decode shuffle masks when the constant pool type is weird It's possible for the constant pool entry for the shuffle mask to come from a completely different operation. This occurs when Constants have the same bit pattern but have different types. Make DecodePSHUFBMask tolerant of types which, after a bitcast, are appropriately sized vector types. This fixes PR22188. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225597 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-11 05:08:57 +00:00
Saleem Abdulrasool	776673ea09	X86: teach X86TargetLowering about L,M,O constraints Teach the ISelLowering for X86 about the L,M,O target specific constraints. Although, for the moment, clang performs constraint validation and prevents passing along inline asm which may have immediate constant constraints violated, the backend should be able to cope with the invalid inline asm a bit better. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225596 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-11 04:39:24 +00:00
Saleem Abdulrasool	5e3c87ee1a	ARM: add support for segment base relocations (SBREL) This adds support for parsing and emitting the SBREL relocation variant for the ARM target. Handling this relocation variant is necessary for supporting the full ARM ELF specification. Addresses PR22128. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225595 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-11 04:39:18 +00:00
Chandler Carruth	561088eb5d	[x86] Remove some windows line endings that snuck into the tests here. Folks on Windows, remember to set up your subversion to strip these when submitting... git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225593 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-11 01:36:20 +00:00
Sanjoy Das	7f0da20b97	Fix PR22179. We were incorrectly inferring nsw for certain SCEVs. We can be more aggressive here (see Richard Smith's comment on http://llvm.org/bugs/show_bug.cgi?id=22179) but this change just focuses on correctness. Differential Revision: http://reviews.llvm.org/D6914 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225591 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-10 23:41:24 +00:00
Simon Pilgrim	47abf0e3da	[X86][SSE] Improved (v)insertps shuffle matching In the current code we only attempt to match against insertps if we have exactly one element from the second input vector, irrespective of how much of the shuffle result is zeroable. This patch checks to see if there is a single non-zeroable element from either input that requires insertion. It also supports matching of cases where only one of the inputs need to be referenced. We also split insertps shuffle matching off into a new lowerVectorShuffleAsInsertPS function. Differential Revision: http://reviews.llvm.org/D6879 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225589 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-10 19:45:33 +00:00
Hal Finkel	9ae5b7a40a	[PowerPC] Mark zext of a small scalar load as free This initial implementation of PPCTargetLowering::isZExtFree marks as free zexts of small scalar loads (that are not sign-extending). This callback is used by SelectionDAGBuilder's RegsForValue::getCopyToRegs, and thus to determine whether a zext or an anyext is used to lower illegally-typed PHIs. Because later truncates of zero-extended values are nops, this allows for the elimination of later unnecessary truncations. Fixes the initial complaint associated with PR22120. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225584 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-10 08:21:59 +00:00
Saleem Abdulrasool	9c4081dce4	tests: fix previous commit The previous commit accidentally missed changes to the test output checking, resulting in an errant failure. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225577 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-10 02:53:25 +00:00
Saleem Abdulrasool	e83fb8c282	test: merge ARM relocations test There is a fair number of relocations that are part of the AAELF specification. Simply merge the tests into a single test file, otherwise, we will end up with far too many test files to test each relocation type. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225576 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-10 02:48:29 +00:00
Saleem Abdulrasool	cc597901e5	tests: convert a couple of ARM relocation tests to readobj These tests are checking the relocation generation. Use the readobj output as it is much easier to follow when glancing over the tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225575 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-10 02:48:25 +00:00
Justin Hibbits	1c6936f6d7	Fully fix Bug #22115 . Summary: In the previous commit, the register was saved, but space was not allocated. This resulted in the parameter save area potentially clobbering r30, leading to nasty results. Test Plan: Tests updated Reviewers: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6906 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225573 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-10 01:57:21 +00:00
Hal Finkel	6829815d96	[PowerPC] Readjust the loop unrolling threshold Now that the way that the partial unrolling threshold for small loops is used to compute the unrolling factor has been corrected, a slightly smaller threshold is preferable. This is expected; other targets may need to re-tune as well. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225566 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-10 00:31:10 +00:00
Hal Finkel	a14d6f1ea5	[LoopUnroll] Fix the partial unrolling threshold for small loop sizes When we compute the size of a loop, we include the branch on the backedge and the comparison feeding the conditional branch. Under normal circumstances, these don't get replicated with the rest of the loop body when we unroll. This led to the somewhat surprising behavior that really small loops would not get unrolled enough -- they could be unrolled more and the resulting loop would be below the threshold, because we were assuming they'd take (LoopSize * UnrollingFactor) instructions after unrolling, instead of (((LoopSize-2) * UnrollingFactor)+2) instructions. This fixes that computation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225565 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-10 00:30:55 +00:00
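The effect of the corrected size estimate in the commit above can be seen in a small sketch. Only the two formulas come from the commit message; the function names, the `limit` cap, and the example numbers are assumptions for illustration, and this is not the LoopUnroll pass's actual code.

```python
def unrolled_size_old(loop_size: int, factor: int) -> int:
    """Old estimate: every instruction is replicated for each unrolled copy."""
    return loop_size * factor


def unrolled_size_new(loop_size: int, factor: int) -> int:
    """New estimate: the backedge branch and its compare (2 instructions) are not replicated."""
    return (loop_size - 2) * factor + 2


def max_unroll_factor(loop_size: int, threshold: int, size_fn, limit: int = 64) -> int:
    """Largest factor (capped at limit) whose estimated size stays within the threshold."""
    factor = 1
    while factor < limit and size_fn(loop_size, factor + 1) <= threshold:
        factor += 1
    return factor
```

For a 4-instruction loop and a threshold of 16, the old estimate stops at a factor of 4 (4 * 4 = 16), while the new one allows a factor of 7 ((4 - 2) * 7 + 2 = 16). This is the "not unrolled enough" behaviour for small loops that the commit fixes.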
Rafael Espindola	68016e0a6e	Use the DiagnosticHandler to print diagnostics when reading bitcode. The bitcode reading interface used std::error_code to report an error to the callers and it is the caller's job to print diagnostics. This is not ideal for error handling or diagnostic reporting: * For error handling, all that the callers care about is 3 possibilities: * It worked * The bitcode file is corrupted/invalid. * The file is not bitcode at all. * For diagnostic, it is user friendly to include far more information about the invalid case so the user can find out what is wrong with the bitcode file. This comes up, for example, when a developer introduces a bug while extending the format. The compromise we had was to have a lot of error codes. With this patch we use the DiagnosticHandler to communicate with the human and std::error_code to communicate with the caller. This allows us to have far fewer error codes and adds the infrastructure to print better diagnostics. This is so because the diagnostics are printed when the issue is found. The code that detected the problem is alive in the stack and can pass down as much context as needed. As an example the patch updates test/Bitcode/invalid.ll. Using a DiagnosticHandler also moves the fatal/non-fatal error decision to the caller. A simple one like llvm-dis can just use fatal errors. The gold plugin needs a bit more complex treatment because of being passed non-bitcode files. A hypothetical interactive tool would make all bitcode errors non-fatal. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225562 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-10 00:07:30 +00:00
Alexey Samsonov	43cc8a5fd1	Disable Go bindings test under UBSan. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225557 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 23:17:23 +00:00
Andrew Kaylor	e17e33b29f	Fix the JIT event listeners and replace the associated tests. The changes to EventListenerCommon.h were contributed by Arch Robison. This fixes bug 22095. http://reviews.llvm.org/D6905 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225554 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 22:53:24 +00:00
Hans Wennborg	ca71be6415	SimplifyCFG: check uses of constant-foldable instrs in switch destinations (PR20210) The previous code assumed that such instructions could not have any uses outside CaseDest, with the motivation that the instruction could not dominate CommonDest because CommonDest has phi nodes in it. That simply isn't true; e.g., CommonDest could have an edge back to itself. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225552 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 22:13:31 +00:00
Simon Pilgrim	34630b6ea9	[X86][SSE] Avoid vector byte shuffles with zero by using pshufb to create zeros pshufb can shuffle in zero bytes as well as bytes from a source vector - we can use this to avoid having to shuffle 2 vectors and ORing the result when the used inputs from a vector are all zeroable. Differential Revision: http://reviews.llvm.org/D6878 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225551 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 22:03:19 +00:00
Rafael Espindola	64fe6cce3e	Add a testcase of llvm-lto error handling. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225545 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 20:55:09 +00:00
Kevin Enderby	6248d1b153	Add the option, -universal-headers, used with -macho to print the Mach-O universal headers to llvm-objdump. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225537 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 19:22:37 +00:00
Tim Northover	8cd39a2630	Re-reapply r221924: "[GVN] Perform Scalar PRE on gep indices that feed loads before doing Load PRE" It's not really expected to stick around, last time it provoked a weird LTO build failure that I can't reproduce now, and the bot logs are long gone. I'll re-revert it if the failures recur. Original description: Perform Scalar PRE on gep indices that feed loads before doing Load PRE. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225536 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 19:19:56 +00:00
Daniel Sanders	8d7b0bdcf0	[mips] Add support for accessing $gp as a named register. Summary: Mips Linux uses $gp to hold a pointer to thread info structure and accesses it with a named register. This makes this work for LLVM. The N32 ABI doesn't quite work yet since the frontend generates incorrect IR for this case. It neglects to truncate the 64-bit GPR to a 32-bit value before converting to a pointer. Given correct IR (as in the testcase in this patch), it works correctly. Reviewers: sstankovic, vmedic, atanasyan Reviewed By: atanasyan Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6893 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225529 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 17:21:30 +00:00
Hal Finkel	139bfee84c	[PowerPC] Enable late partial unrolling on the POWER7 The P7 benefits from not having really small loops so that we either have multiple dispatch groups in the loop and/or the ability to form more-full dispatch groups during scheduling. Setting the partial unrolling threshold to 44 seems good, empirically, for the P7. Compared to using no late partial unrolling, this yields the following test-suite speedups: SingleSource/Benchmarks/Adobe-C++/simple_types_constant_folding -66.3253% +/- 24.1975% SingleSource/Benchmarks/Misc-C++/oopack_v1p8 -44.0169% +/- 29.4881% SingleSource/Benchmarks/Misc/pi -27.8351% +/- 12.2712% SingleSource/Benchmarks/Stanford/Bubblesort -30.9898% +/- 22.4647% I've speculatively added a similar setting for the P8. Also, I've noticed that the unroller does not quite calculate the unrolling factor correctly for really tiny loops because it neglects to account for the fact that not every loop body replicant contains an ending branch and counter increment. I'll fix that later. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225522 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 15:51:16 +00:00
Saleem Abdulrasool	c2a1df7125	ARM: add support for R_ARM_ABS16 Add support for R_ARM_ABS16 relocation mapping. Addresses PR22156. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225510 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 06:57:24 +00:00
Saleem Abdulrasool	466a7dea9b	test: add additional test for SVN r225507 Add an additional test case to ensure that we generate the relocation even if the thumb target is used. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225509 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 06:57:18 +00:00
Saleem Abdulrasool	ea4fe48b22	ARM: add support for R_ARM_ABS8 relocations Add support for R_ARM_ABS8 relocation. Addresses PR22126. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225507 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 05:59:12 +00:00
Matthias Braun	c41acffe22	RegisterCoalescer: Fix removeCopyByCommutingDef with subreg liveness The code that eliminated additional coalescable copies in removeCopyByCommutingDef() used MergeValueNumberInto() which internally may merge A into B or B into A. In this case A and B had different Def points, so we have to reset ValNo.Def to the intended one after merging. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225503 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 03:01:31 +00:00
Hal Finkel	4e98296890	[PowerPC] Fold [sz]ext with fp_to_int lowering where possible On modern cores with lfiw[az]x, we can fold a sign or zero extension from i32 to i64 into the load necessary for an i64 -> fp conversion. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225493 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-09 01:34:30 +00:00
Duncan P. N. Exon Smith	3408708548	Utils: Keep distinct MDNodes distinct in MapMetadata() Create new copies of distinct `MDNode`s instead of following the uniquing `MDNode` logic. Just like self-references (or other cycles), `MapMetadata()` creates a new node. In practice most calls use `RF_NoModuleLevelChanges`, in which case nothing is duplicated anyway. Part of PR22111. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225476 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 22:42:30 +00:00
Duncan P. N. Exon Smith	f416d72973	IR: Add 'distinct' MDNodes to bitcode and assembly Propagate whether `MDNode`s are 'distinct' through the other types of IR (assembly and bitcode). This adds the `distinct` keyword to assembly. Currently, no one actually calls `MDNode::getDistinct()`, so these nodes only get created for: - self-references, which are never uniqued, and - nodes whose operands are replaced that hit a uniquing collision. The concept of distinct nodes is still not quite first-class, since distinct-ness doesn't yet survive across `MapMetadata()`. Part of PR22111. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225474 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 22:38:29 +00:00
Hal Finkel	b7c01bf403	[PowerPC] Mark all instructions as non-cheap for MachineLICM MachineLICM uses a callback named hasLowDefLatency to determine if an instruction def operand has a 'low' latency. If all relevant operands have a 'low' latency, the instruction is considered too cheap to hoist out of loops even in low-register-pressure situations. On PowerPC cores, both the embedded cores and the others, there is no reason to believe that this is a good choice: all instructions have a cost inside a loop, and hoisting them when not limited by register pressure is a reasonable default. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225471 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 22:11:49 +00:00
Akira Hatanaka	40cd57eb5c	[ARM] Fix a bug in constant island pass that was triggering an assertion. The assert was being triggered when the distance between a constant pool entry and its user exceeded the maximally allowed distance after thumb2 branch shortening. A padding was inserted after a thumb2 branch instruction was shrunk, which caused the user to be out of range. This is wrong as the padding should have been inserted by the layout algorithm so that the distance between two instructions doesn't grow later during thumb2 instruction optimization. This commit fixes the code in ARMConstantIslands::createNewWater to call computeBlockSize and set BasicBlock::Unalign when a branch instruction is inserted to create new water after a basic block. A non-zero Unalign causes the worst-case padding to be inserted when adjustBBOffsetsAfter is called to recompute the basic block offsets. rdar://problem/19130476 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225467 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 20:44:50 +00:00
Matt Arsenault	3b1f741856	Fix fcmp + fabs instcombines when using the intrinsic This was only handling the libcall. This is another example of why only the intrinsic should ever be used when it exists. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225465 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 20:09:34 +00:00
Lang Hames	1b3d915de6	[MCJIT] Remove a few redundant MCJIT tests, and drop the extraneous datalayout strings from the copies that remain. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225460 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 18:52:15 +00:00
Rafael Espindola	8aab70ebfe	Make this test a bit stricter. It now checks for the end of the line or the opening '{'. While at it, remove empty comments. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225451 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 16:11:18 +00:00
Justin Hibbits	77e85a150c	Add saving and restoring of r30 to the prologue and epilogue, respectively Summary: The PIC additions didn't update the prologue and epilogue code to save and restore r30 (PIC base register). This does that. Test Plan: Tests updated. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6876 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225450 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 15:47:19 +00:00
Kristof Beyls	d1cee9b3bc	Fix large stack alignment codegen for ARM and Thumb2 targets This partially fixes PR13007 (ARM CodeGen fails with large stack alignment): for ARM and Thumb2 targets, but not for Thumb1, as it seems stack alignment for Thumb1 targets hasn't been supported at all. Producing an aligned stack pointer is done by zero-ing out the lower bits of the stack pointer. The BIC instruction was used for this. However, the immediate field of the BIC instruction can only encode an immediate that zeroes out up to a maximum of the 8 lower bits. When a larger alignment is requested, a BIC instruction cannot be used; llvm was silently producing incorrect code in this case. This commit fixes code generation for large stack alignments by using the BFC instruction instead, when the BFC instruction is available. When not, it uses 2 instructions: a right shift, followed by a left shift to zero out the lower bits. The lowering of ARM::Int_eh_sjlj_dispatchsetup still has code that unconditionally uses BIC to realign the stack pointer, so it very likely has the same problem. However, I wasn't able to produce a test case for that. This commit adds an assert so that the compiler will fail the assert instead of silently generating wrong code if this is ever reached. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225446 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 15:09:14 +00:00
Tom Stellard	9a6e4f08fe	R600/SI: Remove SIISelLowering::legalizeOperands() Its functionality has been replaced by calling SIInstrInfo::legalizeOperands() from SIISelLowering::AdjustInstrPostInstrSelection() and running the SIFoldOperands and SIShrinkInstructions passes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225445 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 15:08:17 +00:00
Elena Demikhovsky	6e8b53da17	Masked Load/Store - fixed a bug in type legalization. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225441 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 12:29:19 +00:00
Michael Kuperstein	477eba5f81	Fix a think-o in the test for r225438. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225440 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 12:05:02 +00:00
Michael Kuperstein	0858c28ca8	[X86] Don't try to generate direct calls to TLS globals The call lowering assumes that if the callee is a global, we want to emit a direct call. This is correct for regular globals, but not for TLS ones. Differential Revision: http://reviews.llvm.org/D6862 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225438 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 11:50:58 +00:00
Craig Topper	cb964a5c58	Fix test case I missed in r225432. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225434 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 07:57:27 +00:00
Craig Topper	367b67df3e	[X86] Don't print 'dword ptr' or 'qword ptr' on the operand to some of the LEA variants in Intel syntax. The memory operand is inherently unsized. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225432 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 07:41:30 +00:00
Adrian Prantl	7e44a65e6b	Revert "Reapply: Teach SROA how to update debug info for fragmented variables." This reverts commit r225379 while investigating an assertion failure reported by Alexey. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225424 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 02:02:00 +00:00
Quentin Colombet	9d60e0ff0a	[RegAllocGreedy] Introduce a late pass to repair broken hints. A broken hint is a copy where both ends are assigned different colors. When a variable gets evicted in the neighborhood of such copies, it is likely we can reconcile some of them. Context Copies are inserted during the register allocation via splitting. These split points are required to relax the constraints on the allocation problem. When such a point is inserted, both ends of the copy would not share the same color with respect to the current allocation problem. When variables get evicted, the allocation problem becomes different and some split point may not be required anymore. However, the related variables may already have been colored. This usually shows up in the assembly with a pattern like this: def A ... save A to B def A use A restore A from B ... use B Whereas we could simply have done: def B ... def A use A ... use B Proposed Solution A variable having a broken hint is marked for late recoloring if and only if selecting a register for it evicts another variable. Indeed, if no eviction happens it is pointless to look for recoloring opportunities as it means the situation was the same as the initial allocation problem where we had to break the hint. Finally, when everything has been allocated, we look for recoloring opportunities for all the identified candidates. The recoloring is performed very late to rely on accurate copy cost (all involved variables are allocated). The recoloring is simple unlike the last chance recoloring. It propagates the color of the broken hint to all its copy-related variables. If the color is available for them, the recoloring uses it, otherwise it gives up on that hint even if a more complex coloring would have worked. The recoloring happens only if it is profitable. The profitability is evaluated using the expected frequency of the copies of the currently recolored variable with a) its current color and b) with the target color. If a) is greater than or equal to b), then it is profitable and the recoloring happens. Example Consider the following example: BB1: a = b = BB2: ... = b = a Let us assume b gets split: BB1: a = b = BB2: c = b ... d = c = d = a Because of how the allocation works, b, c, and d may be assigned different colors. Now, if a gets evicted to make room for c, assuming b and d were assigned to something different than a. We end up with: BB1: a = st a, SpillSlot b = BB2: c = b ... d = c = d e = ld SpillSlot = e It is likely that we can assign the same register for b, c, and d, getting rid of 2 copies. Performances Both ARM64 and x86_64 show performance improvements of up to 3% for the llvm-testsuite + externals with Os and O3. There are a few regressions too that come from the (in)accuracy of the block frequency estimate. <rdar://problem/18312047> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225422 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-08 01:16:39 +00:00
Matthias Braun	a065cf13cd	RegisterCoalescer: Fix valuesIdentical() in some subrange merge cases. I got confused and assumed SrcIdx/DstIdx of the CoalescerPair is a subregister index in SrcReg/DstReg, but they are actually subregister indices of the coalesced register that get you back to SrcReg/DstReg when applied. Fixed the bug, improved comments and simplified code accordingly. Testcase by Tom Stellard! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225415 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 23:58:38 +00:00
Philip Reames	a7f8f932a6	[GC] improve testing around gc.relocate and fix a test Patch by: Ramkumar Ramachandra <artagnon@gmail.com> "This patch started out as an exploration of gc.relocate, and an attempt to write a simple test in call-lowering. I then noticed that the arguments of gc.relocate were not checked fully, so I went in and fixed a few things. Finally, the most important outcome of this patch is that my new error handling code caught a bug in a callsite in stackmap-format." Differential Revision: http://reviews.llvm.org/D6824 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225412 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 22:48:01 +00:00
Tom Stellard	a36b682c17	R600/SI: Commute instructions to enable more folding opportunities git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225410 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 22:44:19 +00:00
Tom Stellard	a3ee583339	R600/SI: Only fold immediates that have one use Folding the same immediate into multiple instructions will increase program size, which can hurt performance. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225405 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 22:18:27 +00:00
Duncan P. N. Exon Smith	c742e3a68d	Linker: Don't use MDNode::replaceOperandWith() `MDNode::replaceOperandWith()` changes all instances of metadata. Stop using it when linking module flags, since (due to uniquing) the flag values could be used by other metadata. Instead, use new API `NamedMDNode::setOperand()` to update the reference directly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225397 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 21:32:27 +00:00
Alexey Samsonov	ec1494b99f	XFAIL several MCJIT EH tests under ASan and MSan bootstrap. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225393 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 21:27:26 +00:00
Rafael Espindola	5061ecc615	Add a test that would have found the issue in r224935. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225385 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 21:10:25 +00:00
Kevin Enderby	60e9ca4c0f	Slightly refactor things for llvm-objdump and the -macho option so it can be used with options other than just -disassemble so that universal files can be used with other options combined with -arch options. No functional change to existing options and use. One test case added for the additional functionality with a universal file and a -arch option. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225383 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 21:02:18 +00:00
Olivier Sallenave	033a537a84	More FMA folding opportunities. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225380 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 20:54:17 +00:00
Adrian Prantl	50bf54ccf4	Reapply: Teach SROA how to update debug info for fragmented variables. The two buildbot failures were addressed in LLVM r225378 and CFE r225359. This reapplies commit 225272 without modifications. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225379 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 20:52:22 +00:00
Adrian Prantl	7950596b55	Debug info: Allow aggregate types to be described by constants. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225378 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 20:48:58 +00:00
Colin LeMahieu	51817073b3	[Hexagon] Adding floating point classification and creation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225374 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 20:28:57 +00:00
Tom Stellard	f7587043ef	R600/SI: Add a V_MOV_B64 pseudo instruction This is used to simplify the SIFoldOperands pass and make it easier to fold immediates. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225373 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 20:27:25 +00:00
Colin LeMahieu	22ddfae848	[Hexagon] Adding encodings for v5 floating point instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225372 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 20:24:09 +00:00
Colin LeMahieu	22efbc70a7	[Hexagon] Adding encoding for popcount, fastcorner, dword asr with rounding. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225371 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 20:07:28 +00:00
Tom Stellard	546520a727	R600/SI: Teach SIFoldOperands to split 64-bit constants when folding This allows folding of sequences like: s[0:1] = s_mov_b64 4 v_add_i32 v0, s0, v0 v_addc_u32 v1, s1, v1 into v_add_i32 v0, 4, v0 v_add_i32 v1, 0, v1 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225369 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 19:56:17 +00:00
Philip Reames	28fa9e1e9f	Introduce an example statepoint GC strategy This change includes the most basic possible GCStrategy for a GC which is using the statepoint lowering code. At the moment, this GCStrategy doesn't really do much - aside from actually generate correct stackmaps that is - but I went ahead and added a few extra correctness checks as proof of concept. It's mostly here to provide documentation on how to do one, and to provide a point for various optimization legality hooks I'd like to add going forward. (For context, see the TODOs in InstCombine around gc.relocate.) Most of the validation logic added here as proof of concept will soon move into the Verifier. That move is dependent on http://reviews.llvm.org/D6811 There was discussion in the review thread about addrspace(1) being reserved for something. I'm going to follow up on a separate llvmdev thread. If needed, I'll update all the code at once. Note that I am deliberately not making a GCStrategy required to use gc.statepoints with this change. I want to give folks out of tree - including myself - a chance to migrate. In a week or two, I'll make having a GCStrategy be required for gc.statepoints. To this end, I added the gc tag to one of the test cases but not others. Differential Revision: http://reviews.llvm.org/D6808 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225365 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 19:07:50 +00:00
David Majnemer	60812c05e7	X86: Allow the stack probe size to be configurable per function LLVM emits stack probes on Windows targets to ensure that the stack is correctly accessed. However, the amount of stack allocated before emitting such a probe is hardcoded to 4096. It is desirable to have this be configurable so that a function might opt-out of stack probes. Our level of granularity is at the function level instead of, say, the module level to permit proper generation of code after LTO. Patch by Andrew H! N.B. The inliner needs to be updated to properly consider what happens after inlining a function with a specific stack-probe-size into another function with a different stack-probe-size. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225360 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 18:14:07 +00:00
Ahmed Bougacha	d412c608fc	[X86] Teach FCOPYSIGN lowering to recognize constant magnitudes. For code like: float foo(float x) { return copysign(1.0, x); } We used to generate: andps <-0.000000e+00,0,0,0>, %xmm0 movss <1.000000e+00>, %xmm1 andps <nan>, %xmm1 orps %xmm0, %xmm1 Basically doing an abs(1.0f) in the two middle instructions. We now generate: andps <-0.000000e+00,0,0,0>, %xmm0 orps <1.000000e+00,0,0,0>, %xmm0 Builds on cleanups r223415, r223542. rdar://19049548 Differential Revision: http://reviews.llvm.org/D6555 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225357 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 17:33:03 +00:00
Charlie Turner	7fbbc81d65	[ARM] Add missing Tag_DIV_use tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225348 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 11:37:40 +00:00
Chandler Carruth	7372d445af	[PM] Give slightly less horrible names to the utility pass templates for requiring and invalidating specific analyses. Also make their printed names match their class names. Writing these out as prose really doesn't make sense to me any more. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225346 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 11:14:51 +00:00
Karthik Bhat	f2b3638c3d	Revert r225165 and r225169 Even though gcc produces similar instructions, as Owen pointed out, the two patterns aren’t equivalent in the case where the original subtraction could have caused an overflow. Reverting the same. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225341 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 06:34:34 +00:00
Chandler Carruth	9fc5a53118	[PM] Fix a pretty nasty bug where the new pass manager would invalidate passes too many times. I think this is actually the issue that someone raised with me at the developer's meeting and in an email, but that we never really got to the bottom of. Having all the testing utilities made it much easier to dig down and uncover the core issue. When a pass manager is running many passes over a single function, we need it to invalidate the analyses between each run so that they can be re-computed as needed. We also need to track the intersection of preserved higher-level analyses across all the passes that we run (for example, if there is one module analysis which all the function analyses preserve, we want to track that and propagate it). Unfortunately, this interacted poorly with any enclosing pass adaptor between two IR units. It would see the intersection of preserved analyses, and need to invalidate any other analyses, but some of the un-preserved analyses might have already been invalidated and recomputed! We would fail to propagate the fact that the analysis had already been invalidated. The solution to this struck me as really strange at first, but the more I thought about it, the more natural it seemed. After a nice discussion with Duncan about it on IRC, it seemed even nicer. The idea is that invalidating an analysis causes it to be preserved! Preserving the lack of result is trivial. If it is recomputed, great. Until something else invalidates it again, we're good. The consequence of this is that the invalidate methods on the analysis manager which operate over many passes now consume their PreservedAnalyses object, update it to "preserve" every analysis pass to which it delivers an invalidation (regardless of whether the pass chooses to be removed, or handles the invalidation itself by updating itself). Then we return this augmented set from the invalidate routine, letting the pass manager take the result and use the intersection of that across each pass run to compute the final preserved set. This accounts for all the places where the early invalidation of an analysis has already "preserved" it for a future run. I've beefed up the testing and adjusted the assertions to show that we no longer repeatedly invalidate or compute the analyses across nested pass managers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225333 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-07 01:58:35 +00:00
Matt Arsenault	6a72b20325	R600/SI: Add combine for isinfinite pattern git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225310 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 23:00:46 +00:00
Matt Arsenault	42d9f7cf0a	R600/SI: Pattern match isinf to v_cmp_class instructions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225307 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 23:00:41 +00:00
Matt Arsenault	a5b2b64292	R600/SI: Add basic DAG combines for fp_class git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225306 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 23:00:39 +00:00
Matt Arsenault	b6520ab625	R600/SI: Add class intrinsic git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225305 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 23:00:37 +00:00
Matt Arsenault	374b57cec9	Fix using wrong intrinsic in test This is a leftover from renaming the intrinsic. It's surprising the unknown llvm. intrinsic wasn't rejected. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225304 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 23:00:33 +00:00
Rafael Espindola	f907a26bc2	Change the .ll syntax for comdats and add syntactic sugar. In order to make comdats always explicit in the IR, we decided to make the syntax a bit more compact for the case of a GlobalObject in a comdat with the same name. Just dropping the $name causes problems for @foo = global i32 0, comdat $bar = comdat ... and declare void @foo() comdat $bar = comdat ... So the syntax is changed to @g1 = global i32 0, comdat($c1) @g2 = global i32 0, comdat and declare void @foo() comdat($c1) declare void @foo() comdat git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225302 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 22:55:16 +00:00
Hal Finkel	8e9ba0e588	[PowerPC] Reuse a load operand in int->fp conversions int->fp conversions on PPC must be done through memory loads and stores. On a modern core, this process begins by storing the int value to memory, then loading it using a (sometimes special) FP load instruction. Unfortunately, we would do this even when the value to be converted was itself a load, and we can just use that same memory location instead of copying it to another first. There is a slight complication when handling int_to_fp(fp_to_int(x)) pairs, because the fp_to_int operand has not been lowered when the int_to_fp is being lowered. We handle this specially by invoking fp_to_int's lowering logic (partially) and getting the necessary memory location (some trivial refactoring was done to make this possible). This is all somewhat ugly, and it would be nice if some later CodeGen stage could just clean this stuff up, but because doing so would involve modifying target-specific nodes (or instructions), it is not immediately clear how that would work. Also, remove a related entry from the README.txt for which we now generate reasonable code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225301 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 22:31:02 +00:00
Colin LeMahieu	a602a7f199	[Hexagon] Adding compound jump encodings. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225291 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 20:03:31 +00:00
Tom Stellard	bac89f3dd2	R600/SI: Insert s_waitcnt before s_barrier instructions. This ensures that all memory operations are complete when all threads reach the barrier. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225290 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 19:52:07 +00:00
Adrian Prantl	d2c42b9617	Revert "Reapply: Teach SROA how to update debug info for fragmented variables." because of a tsan buildbot failure. This reverts commit 225272. Fix should be coming soon. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225288 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 19:47:27 +00:00
Colin LeMahieu	3d1d6d9043	[Hexagon] Adding encoding for misc v4 instructions: boundscheck, tlbmatch, dcfetch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225283 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 19:03:20 +00:00
Sanjoy Das	31123d4529	This patch teaches IndVarSimplify to add nuw and nsw to certain kinds of operations that provably don't overflow. For example, we can prove %civ.inc below does not sign-overflow. With this change, IndVarSimplify changes %civ.inc to an add nsw. define i32 @foo(i32* %array, i32* %length_ptr, i32 %init) { entry: %length = load i32* %length_ptr, !range !0 %len.sub.1 = sub i32 %length, 1 %upper = icmp slt i32 %init, %len.sub.1 br i1 %upper, label %loop, label %exit loop: %civ = phi i32 [ %init, %entry ], [ %civ.inc, %latch ] %civ.inc = add i32 %civ, 1 %cmp = icmp slt i32 %civ.inc, %length br i1 %cmp, label %latch, label %break latch: store i32 0, i32* %array %check = icmp slt i32 %civ.inc, %len.sub.1 br i1 %check, label %loop, label %break break: ret i32 %civ.inc exit: ret i32 42 } Differential Revision: http://reviews.llvm.org/D6748 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225282 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 19:02:56 +00:00
Colin LeMahieu	63d0449f11	[Hexagon] Adding encoding information for absolute address loads. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225279 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 18:38:26 +00:00
Tom Stellard	1f996fa36b	R600/SI: Add a stub GCNTargetMachine This is equivalent to the AMDGPUTargetMachine now, but it is the starting point for separating R600 and GCN functionality into separate targets. It is recommended that users start using the gcn triple for GCN-based GPUs, because using the r600 triple for these GPUs will be deprecated in the future. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225277 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 18:00:21 +00:00
Andrea Di Biagio	e46783d5b7	[CodeGenPrepare] Improved logic to speculate calls to cttz/ctlz. This patch improves the logic added at revision 224899 (see review D6728) that teaches the backend when it is profitable to speculate calls to cttz/ctlz. The original algorithm conservatively avoided speculating more than one instruction from a basic block in a control flow graph modelling an if-statement. In particular, the only allowed instruction (excluding the terminator) was a call to cttz/ctlz. However, there are cases where we could be less conservative and still be able to speculate a call to cttz/ctlz. With this patch, CodeGenPrepare now tries to speculate a cttz/ctlz if the result is zero extended/truncated in the same basic block, and the zext/trunc instruction is "free" for the target. Added new test cases to CodeGen/X86/cttz-ctlz.ll Differential Revision: http://reviews.llvm.org/D6853 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225274 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 17:41:18 +00:00
Adrian Prantl	46cb54c0fb	Reapply: Teach SROA how to update debug info for fragmented variables. This also rolls in the changes discussed in http://reviews.llvm.org/D6766. Defers migrating the debug info for new allocas until after all partitions are created. Thanks to Chandler for reviewing! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225272 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 17:14:10 +00:00
Filipe Cabecinhas	d682839830	Don't loop endlessly for MachO files with 0 ncmds git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225271 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 17:08:26 +00:00
Hal Finkel	15914b5c22	[PowerPC] Add a regression test for r225251 In r225251, I removed an old entry from the README.txt file. While there are several contributing factors (including pieces in Clang's ABI code), upon further reflection, the backend part deserves a regression test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225268 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 16:46:37 +00:00
Colin LeMahieu	a24e012976	[Hexagon] Adding dealloc_return encoding and absolute address stores. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225267 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 16:15:15 +00:00
Matt Arsenault	d883ca0ca7	Convert fcmp with 0.0 from casted integers to icmp This is already handled in general when it is known the conversion can't lose bits with smaller integer types casted into wider floating point types. This pattern happens somewhat often in GPU programs that cast workitem intrinsics to float, which are often compared with 0. Specifically handle the special case of compares with zero which should also be known to not lose information. I had a more general version of this which allows equality compares if the casted float is exactly representable in the integer, but I'm not 100% confident that is always correct. Also fold cases that aren't integers to true / false. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225265 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 15:50:59 +00:00
Chandler Carruth	0df89054c0	[PM] Introduce a utility pass that preserves no analyses. Use this to test that path of invalidation. This test actually shows redundant invalidation here that is really bad. I'm going to work on fixing that next, but wanted to commit the test harness now that it's all working. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225257 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 09:06:35 +00:00
Craig Topper	f6145affbf	[X86] Add OpSize32 to XBEGIN_4. Add XBEGIN_2 with OpSize16. Requires new AsmParserOperand types that detect 16-bit and 32/64-bit mode so that we choose the right instruction based on default sizing without predicates. This is necessary since predicates mess up the disassembler table building. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225256 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 08:59:30 +00:00
David Majnemer	51e4a66417	InstCombine: Bitcast call arguments from/to pointer/integer type Try harder to get rid of bitcast'd calls by ptrtoint/inttoptr'ing arguments and return values when DataLayout says it is safe to do so. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225254 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 08:41:31 +00:00
Chandler Carruth	6409bd68de	[PM] Simplify how we parse the outer layer of the pass pipeline text and remove an extra, redundant pass manager wrapping every run. I had kept seeing these when manually testing, but it was getting really annoying and was going to cause problems with overly eager invalidation. The root cause was an overly complex and unnecessary pile of code for parsing the outer layer of the pass pipeline. We can instead delegate most of this to the recursive pipeline parsing. I've added some somewhat more basic and precise tests to catch this. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225253 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 08:37:58 +00:00
David Majnemer	b3065539bd	X86: Don't make illegal GOTTPOFF relocations "ELF Handling for Thread-Local Storage" specifies that R_X86_64_GOTTPOFF relocation target a movq or addq instruction. Prohibit the truncation of such loads to movl or addl. This fixes PR22083. Differential Revision: http://reviews.llvm.org/D6839 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225250 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 07:12:52 +00:00
Hal Finkel	10ae865847	[PowerPC] Improve int_to_fp(fp_to_int(x)) combining The old target DAG combine that allowed for performing int_to_fp(fp_to_int(x)) without a load/store pair is updated here with support for unsigned integers, and to support single-precision values without a third rounding step, on newer cores with the appropriate instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225248 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 06:01:57 +00:00
Chandler Carruth	17395fa733	[PM] Add a utility pass template that synthesizes the invalidation of a specific analysis result. This is quite handy to test things, and will also likely be very useful for debugging issues. You could narrow down pass validation failures by walking these invalidate pass runs up and down the pass pipeline, etc. I've added support to the pass pipeline parsing to be able to create one of these for any analysis pass desired. Just adding this class uncovered one latent bug where the AnalysisManager CRTP base class had a hard-coded Module type rather than using IRUnitT. I've also added tests for invalidation and caching of analyses in a basic way across all the pass managers. These in turn uncovered two more bugs where we failed to correctly invalidate an analysis -- its results were invalidated but the key for re-running the pass was never cleared and so it was never re-run. Quite nasty. I'm very glad to debug this here rather than with a full system. Also, yes, the naming here is horrid. I'm going to update some of the names to be slightly less awful shortly. But really, I've no "good" ideas for naming. I'll be satisfied if I can get it to "not bad". git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225246 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 04:49:44 +00:00
Chandler Carruth	5b12a2f703	[PM] Add a collection of no-op analysis passes and switch the new pass manager tests to use them and be significantly more comprehensive. This, naturally, uncovered a bug where the CGSCC pass manager wasn't printing analyses when they were run. The only remaining core manipulator is I think an invalidate pass similar to the require pass. That'll be next. =] git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225240 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 02:50:06 +00:00
Chandler Carruth	a3376d2d36	[PM] Add a utility to the new pass manager for generating a pass which is a no-op other than requiring some analysis results be available. This can be used in real pass pipelines to force the usually lazy analysis running to eagerly compute something at a specific point, and it can be used to test the pass manager infrastructure (my primary use at the moment). I've also added a bit of pipeline parsing magic to support generating these directly from the opt command so that you can directly use these when debugging your analysis. The syntax is: require<analysis-name> This can be used at any level of the pass manager. For example: cgscc(function(require<my-analysis>,no-op-function)) This would produce a no-op function pass requiring my-analysis, followed by a fully no-op function pass, both of these in a function pass manager which is nested inside of a bottom-up CGSCC pass manager which is in the top-level (implicit) module pass manager. I have zero attachment to the particular syntax I'm using here. Consider it a straw man for use while I'm testing and fleshing things out. Suggestions for better syntax welcome, and I'll update everything based on any consensus that develops. I've used this new functionality to more directly test the analysis printing rather than relying on the cgscc pass manager running an analysis for me. This is still minimally tested because I need to have analyses to run first! ;] That patch is next, but wanted to keep this one separate for easier review and discussion. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225236 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 02:10:51 +00:00
Rafael Espindola	5165dfdf9a	Add a testcase that would have found the problem in r225048. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225235 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 01:41:24 +00:00
Lang Hames	bce877c84c	Revert r225048: It broke ObjC on AArch64. I've filed http://llvm.org/PR22100 to track this issue. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225228 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-06 00:54:32 +00:00
Hal Finkel	38c3e2f5c5	[PowerPC] Fix test to pass on Darwin hosts git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225220 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 23:17:43 +00:00
Hal Finkel	fcfee17911	[PowerPC] Convert a README.txt entry into a better test We now produce the desired code as noted in the README.txt file (no spurious or). Remove the README entry and improve the regression test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225214 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 21:53:52 +00:00
Colin LeMahieu	e4f1dcdb83	[Hexagon] Adding add/sub with carry, logical shift left by immediate and memop instructions. Removing old defs without bits and updating references. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225210 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 21:36:38 +00:00
Hal Finkel	1b84bf2554	[PowerPC] Add a test for truncating a shifted load We now produce the desired code as noted in the README.txt file. Remove the README entry and add a regression test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225209 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 21:33:14 +00:00
Frederic Riss	5a0743e1e8	[dsymutil] Implement the BinaryHolder object and gain archive support. This object is meant to own the ObjectFiles and their underlying MemoryBuffer. It is basically the equivalent of an OwningBinary except that it efficiently handles Archives. It is optimized for efficiently providing mappings of members of the same archive when they are opened successively (which is standard in Darwin debug maps, objects from the same archive will be contiguous). Of course, the BinaryHolder will also be used by the DWARF linker once it is committed, but for now only the debug map parser uses it. With this change, you can run llvm-dsymutil on your Darwin debug build of clang and get a complete debug map for it. Differential Revision: http://reviews.llvm.org/D6690 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225207 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 21:29:28 +00:00
Hal Finkel	e7d845b709	[PowerPC] Add another test for load/store with update We now produce the desired code as noted in the README.txt file. Remove the README entry and add a regression test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225205 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 21:22:42 +00:00
Hal Finkel	ccc83e4a08	[PowerPC] Fold i1 extensions with other ops Consider this function from our README.txt file: int foo(int a, int b) { return (a < b) << 4; } We now explicitly track CR bits by default, so the comment in the README.txt about not really having a SETCC is no longer accurate, but we did generate this somewhat silly code: cmpw 0, 3, 4 li 3, 0 li 12, 1 isel 3, 12, 3, 0 sldi 3, 3, 4 blr which generates the zext as a select between 0 and 1, and then shifts the result by a constant amount. Here we preprocess the DAG in order to fold the results of operations on an extension of an i1 value into the SELECT_I[48] pseudo instruction when the resulting constant can be materialized using one instruction (just like the 0 and 1). This was not implemented as a DAGCombine because the resulting code would have been anti-canonical and depends on replacing chained user nodes, which does not fit well into the lowering paradigm. Now we generate: cmpw 0, 3, 4 li 3, 0 li 12, 16 isel 3, 12, 3, 0 blr which is less silly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225203 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 21:10:24 +00:00
Colin LeMahieu	ca96263b05	[Hexagon] Adding rounding reg/reg variants, accumulating multiplies, and accumulating shifts. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225201 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 20:56:41 +00:00
Colin LeMahieu	27494b0633	[Hexagon] Adding V4 bit manipulating instructions, removing ALU defs without encoding bits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225199 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 20:35:54 +00:00
Colin LeMahieu	c8e734a561	[Hexagon] Adding V4 logic-logic instructions and tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225198 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 20:14:58 +00:00
Colin LeMahieu	e48ec2a918	[Hexagon] Adding orand, bitsplit reg/reg, and modwrap instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225197 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 20:04:40 +00:00
Hal Finkel	3ab10c1918	[PowerPC] Remove zexts after i32 ctlz The 64-bit semantics of cntlzw are not special, the 32-bit leading-zero count is stored as a 64-bit value in the range [0,32]. As a result, it is always zero extended, and it can be added to the PPCISelDAGToDAG peephole optimization as a frontier instruction for the removal of unnecessary zero extensions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225192 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 18:52:29 +00:00
Hal Finkel	0ef99720c5	[PowerPC] Remove zexts after byte-swapping loads lhbrx and lwbrx not only load their data with byte swapping, but also clear the upper 32 bits (at least). As a result, they can be added to the PPCISelDAGToDAG peephole optimization as frontier instructions for the removal of unnecessary zero extensions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225189 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 18:09:06 +00:00
Colin LeMahieu	9e989cf190	[Hexagon] Adding round reg/imm and bitsplit instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225188 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 18:08:21 +00:00
Ahmed Bougacha	3c9fb6e1ad	[AArch64] Improve codegen of store lane instructions by avoiding GPR usage. We used to generate code similar to: umov.b w8, v0[2] strb w8, [x0, x1] because the STRro patterns were preferred to ST1. Instead, we can avoid going through GPRs, and generate: add x8, x0, x1 st1.b { v0 }[2], [x8] This patch increases the ST1 AddedComplexity to achieve that. rdar://16372710 Differential Revision: http://reviews.llvm.org/D6202 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225183 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 17:10:26 +00:00
Ahmed Bougacha	c52cd839b9	[AArch64] Improve codegen of store lane 0 instructions by directly storing the subregister. For 0-lane stores, we used to generate code similar to: fmov w8, s0 str w8, [x0, x1, lsl #2] instead of: str s0, [x0, x1, lsl #2] To correct that: for store lane 0 patterns, directly match to STR <subreg>0. Byte-sized instructions don't have the special case for a 0 index, because FPR8s are defined to have untyped content. rdar://16372710 Differential Revision: http://reviews.llvm.org/D6772 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225181 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 17:02:28 +00:00
NAKAMURA Takumi	19d9f342ed	llvm/test/lit.cfg: have_ld_plugin_support(): Use decode() for stdout. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225171 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 14:18:04 +00:00
Karthik Bhat	050064d32c	Select lower fsub,fabs pattern to fabd on AArch64 This patch lowers patterns such as- fsub v0.4s, v0.4s, v1.4s fabs v0.4s, v0.4s to fabd v0.4s, v0.4s, v1.4s on AArch64. Review: http://reviews.llvm.org/D6791 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225169 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 13:57:59 +00:00
Charlie Turner	6abfc44aab	Parse Tag_compatibility correctly. Tag_compatibility takes two arguments, but before this patch it would erroneously accept just one; it now produces an error in that case. Change-Id: I530f918587620d0d5dfebf639944d6083871ef7d git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225167 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 13:26:37 +00:00
Charlie Turner	b99b8ffb7f	Emit the build attribute Tag_conformance. Claim conformance to version 2.09 of the ARM ABI. This build attribute must be emitted first amongst the build attributes when written to an object file. This is to simplify conformance detection by consumers. Change-Id: If9eddcfc416bc9ad6e5cc8cdcb05d0031af7657e git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225166 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 13:12:17 +00:00
Karthik Bhat	e239724d12	Select lower sub,abs pattern to sabd on AArch64 This patch lowers patterns such as- sub v0.4s, v0.4s, v1.4s abs v0.4s, v0.4s to sabd v0.4s, v0.4s, v1.4s on AArch64. Review: http://reviews.llvm.org/D6781 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225165 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 13:11:07 +00:00
Michael Kuperstein	25903ef9bc	Fix broken test from r225159. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225164 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 12:34:01 +00:00
Chandler Carruth	1ab487fdc7	[PM] Don't run the machinery of invalidating all the analysis passes when all are being preserved. We want to short-circuit this for a couple of reasons. One, I don't really want passes to grow a dependency on actually receiving their invalidate call when they've been preserved. I'm thinking about removing this entirely. But more importantly, preserving everything is likely to be the common case in a lot of scenarios, and it would be really good to bypass all of the invalidation and preservation machinery there. Avoiding calling N opaque functions to try to invalidate things that are by definition still valid seems important. =] This wasn't really inspired by much other than seeing the spam in the logging for analyses, but it seems better to get it checked in rather than forgetting about it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225163 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 12:32:11 +00:00
Chandler Carruth	040ca449b2	[PM] Add names and debug logging for analysis passes to the new pass manager. This starts to allow us to test analyses more easily, but it's really only the beginning. Some of the code here is still untestable without manual changes to create analysis passes, but I wanted to factor it into as small chunks as possible. Next up in order to be able to test things are, in no particular order: - No-op analysis passes so we don't have to use real ones to exercise the pass manager itself. - Automatic way of generating dummy passes that require an analysis be run, including a variant that calls a 'print' method on a pass to make it even easier to print out the results of an analysis. - Dummy passes that invalidate all analyses for their IR unit so we can test invalidation and re-runs. - Automatic way to print each analysis pass as it is re-run. - Automatic but optional verification of analysis passes everywhere possible. I'm not claiming I'll get to all of these immediately, but that's what is in the pipeline at some stage. I'm fleshing out exactly what I need and what to prioritize by working on converting analyses and then trying to test the conversion. =] git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225162 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 12:21:44 +00:00
Jiangning Liu	614fe873ce	Fixed a bug in memory dependence checking module of loop vectorization. The following loop should not be vectorized with current algorithm. {code} // loop body ... = a[i] (1) ... = a[i+1] (2) ....... a[i+1] = .... (3) a[i] = ... (4) {code} The algorithm tries to collect memory access candidates from AliasSetTracker, and then checks memory dependences against one another. The memory accesses are unique in AliasSetTracker, and a single memory access in AliasSetTracker may map to multiple entries in AccessAnalysis, which could cover both 'read' and 'write'. Originally the algorithm only checked 'write' entry in Accesses if only 'write' exists. This is incorrect and the consequence is it ignored all read accesses, and finally some RAW and WAR dependences are missed. For the case given above, if we ignore two reads, the dependence between (1) and (3) would not be able to be captured, and finally this loop will be incorrectly vectorized. The fix simply inserts a new loop to find all entries in Accesses. Since it will skip most other memory accesses by checking the Value pointer at the very beginning of the loop, it should not increase compile-time visibly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225159 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 10:08:58 +00:00
Hal Finkel	0ef8f3189e	[PowerPC] Enable speculation of cttz/ctlz PPC has an instruction for ctlz with defined zero behavior, and our lowering of cttz (provided by DAGCombine) is also efficient and branchless, so speculating these makes sense. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225150 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 05:24:42 +00:00
Chandler Carruth	4f9a7277d1	[SROA] Apply a somewhat heavy and unpleasant hammer to fix PR22093, an assert out of the new pre-splitting in SROA. This fix makes the code do what was originally intended -- when we have a store of a load both dealing in the same alloca, we force them to both be pre-split with identical offsets. This is really quite hard to do because we can keep discovering problems as we go along. We have to track every load over the current alloca which for any reason becomes invalid for pre-splitting, and go back to remove all stores of those loads. I've included a couple of test cases derived from PR22093 that cover the different ways this can happen. While that PR only really triggered the first of these two, it's the same fundamental issue. The other challenge here is documented in a FIXME now. We end up being quite a bit more aggressive for pre-splitting when loads and stores don't refer to the same alloca. This aggressiveness comes at the cost of introducing potentially redundant loads. It isn't clear that this is the right balance. It might be considerably better to require that we only do pre-splitting when we can presplit every load and store involved in the entire operation. That would give more consistent if conservative results. Unfortunately, it requires a non-trivial change to the actual pre-splitting operation in order to correctly handle cases where we end up pre-splitting stores out-of-order. And it isn't 100% clear that this is the right direction, although I'm starting to suspect that it is. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225149 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 04:17:53 +00:00
Hal Finkel	9cad6c8a24	[PowerPC] Materialize i64 constants using rotation with masking r225135 added the ability to materialize i64 constants using rotations in order to reduce the instruction count. Sometimes we can use a rotation only with some extra masking, so that we take advantage of the fact that generating a bunch of extra higher-order 1 bits is easy using li/lis. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225147 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 03:41:38 +00:00
Chandler Carruth	51fa09d980	[PM] Wire up support for explicitly running the verifier pass. The required functionality has been there for some time, but I never managed to actually wire it into the command line registry of passes. Let's do that. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225144 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-05 00:08:53 +00:00
Simon Pilgrim	c0c36083da	[X86][SSE] Added vector packing test for pr12412 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225138 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-04 19:08:03 +00:00
Simon Pilgrim	dc18ec0e0d	[X86][SSE] Added vector integer truncation tests - based off pr15524 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225137 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-04 17:52:00 +00:00
Hal Finkel	2ac0826af3	[PowerPC] Materialize i64 constants using rotation Materializing full 64-bit constants on PPC64 can be expensive, requiring up to 5 instructions depending on the locations of the non-zero bits. Sometimes materializing a rotated constant, and then applying the inverse rotation, requires fewer instructions than the direct method. If so, do that instead. In r225132, I added support for forming constants using bit inversion. In effect, this reverts that commit and replaces it with rotation support. The bit inversion is useful for turning constants that are mostly ones into ones that are mostly zeros (thus enabling a more-efficient shift-based materialization), but the same effect can be obtained by using negative constants and a rotate, and that is at least as efficient, if not more. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225135 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-04 15:43:55 +00:00
Hal Finkel	d138a7bb3f	[PowerPC] Materialize i64 constants using bit inversion Materializing full 64-bit constants on PPC64 can be expensive, requiring up to 5 instructions depending on the locations of the non-zero bits. Sometimes materializing the bit-inverted constant, and then flipping the bits, requires fewer instructions than the direct method. If so, do that instead. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225132 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-04 12:35:03 +00:00
David Majnemer	07d7dbae9e	InstCombine: match can find ConstantExprs, don't assume we have a Value We assumed the output of a match was a Value, this would cause us to assert because we would fail a cast<>. Instead, use a helper in the Operator family to hide the distinction between Value and Constant. This fixes PR22087. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225127 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-04 07:36:02 +00:00
David Majnemer	77e22b7836	ValueTracking: ComputeNumSignBits should tolerate misshapen phi nodes PHI nodes can have zero operands in the middle of a transform. It is expected that utilities in Analysis don't freak out when this happens. Note that it is considered invalid to allow these misshapen phi nodes to make it to another pass. This fixes PR22086. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225126 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-04 07:06:53 +00:00
Saleem Abdulrasool	b19a485253	llvm-readobj: add support to dump COFF export tables This enhances llvm-readobj to print out the COFF export table, similar to the -coff-import option. This is useful for testing in lld. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225120 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-03 21:35:09 +00:00
Saleem Abdulrasool	97f8f69a7f	ARM: permit tail calls to weak externals on COFF Weak externals are resolved statically, so we can actually generate the tail call on PE/COFF targets without breaking the requirements. It is questionable whether we want to propagate the current behaviour for MachO as the requirements are part of the ARM ELF specifications, and it seems that prior to the SVN r215890, we would have tail'ed the call. For now, be conservative and only permit it on PE/COFF where the call will always be fully resolved. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225119 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-03 21:35:00 +00:00
Hal Finkel	e05b232c20	[PowerPC/BlockPlacement] Allow target to provide a per-loop alignment preference The existing code provided for specifying a global loop alignment preference. However, the preferred loop alignment might depend on the loop itself. For recent POWER cores, loops between 5 and 8 instructions should have 32-byte alignment (while the others are better with 16-byte alignment) so that the entire loop will fit in one i-cache line. To support this, getPrefLoopAlignment has been made virtual, and can be provided with an optional MachineLoop* so the target can inspect the loop before answering the query. The default behavior, as before, is to return the value set with setPrefLoopAlignment. MachineBlockPlacement now queries the target for each loop instead of only once per function. There should be no functional change for other targets. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225117 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-03 17:58:24 +00:00
Hal Finkel	a1d22cc789	[PowerPC] Use 16-byte alignment for modern cores for functions/loops Most modern PowerPC cores prefer that functions and loops start on 16-byte-aligned boundaries (*), so instruct block placement, etc. to make this happen. The branch selector has also been adjusted to account for the extra nops that might now be inserted before loop headers. (*) Some cores actually prefer other alignments for small loops, but that will be addressed in a follow-up commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225115 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-03 14:58:25 +00:00
Hal Finkel	958b670c34	[PowerPC] Add support for the CMPB instruction Newer POWER cores, and the A2, support the cmpb instruction. This instruction compares its operands, treating each of the 8 bytes in the GPRs separately, returning a 'mask' result of 0 (for false) or -1 (for true) in each byte. Code generation support is added, in the form of a PPCISelDAGToDAG DAG-preprocessing routine, that recognizes patterns close to what the instruction computes (either exactly, or related by a constant masking operation), and generates the cmpb instruction (along with any necessary constant masking operation). This can be expanded if use cases arise. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@225106 91177308-0d34-0410-b5e6-96231b3b80d8	2015-01-03 01:16:37 +00:00
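
The byte-wise compare described in the CMPB entry above can be modelled in a few lines. This is only an illustrative sketch of the semantics stated in that commit message (each of the 8 bytes compared separately, with 0x00 for false and 0xFF, i.e. -1, for true in each result byte); the function name `cmpb_model` is hypothetical and is not part of LLVM or of the patch.

```python
MASK64 = (1 << 64) - 1  # treat inputs as 64-bit GPR values


def cmpb_model(rs: int, rb: int) -> int:
    """Software model of a byte-wise compare on two 64-bit values.

    Hypothetical helper for illustration only: each result byte is 0xFF
    where the corresponding bytes of rs and rb are equal, and 0x00 otherwise.
    """
    rs &= MASK64
    rb &= MASK64
    result = 0
    for i in range(8):
        shift = 8 * i
        # Compare byte i of both operands.
        if ((rs >> shift) & 0xFF) == ((rb >> shift) & 0xFF):
            result |= 0xFF << shift
    return result


if __name__ == "__main__":
    # Alternating bytes match, so alternating result bytes are set.
    print(hex(cmpb_model(0x1122334455667788, 0x1100330055007700)))
```

A pattern such as "test which bytes of a word are zero" then corresponds to comparing against 0 in this model, which is the kind of shape the commit message says the DAG-preprocessing routine looks for.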


28185 Commits