llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-09-03 02:30:00 +00:00

Author	SHA1	Message	Date
Pavel Chupin	586994a74e	[x32] Emit callq for CALLpcrel32 Summary: In AT&T annotation for both x86_64 and x32 calls should be printed as callq in assembly. It's only a matter of correct mnemonic, object output is ok. Test Plan: trivial test added Reviewers: nadav, dschuff, craig.topper Subscribers: llvm-commits, zinovy.nis Differential Revision: http://reviews.llvm.org/D5213 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217435 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-09 11:54:12 +00:00
Renato Golin	ccfbbaca3f	ARM: Negative offset support problem This patch is to permit a negative offset usage for a non frame access. Patch by Igor Oblakov. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217431 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-09 09:57:59 +00:00
Bob Wilson	086832979b	Set trunc store action to Expand for all X86 targets. When compiling without SSE2, isTruncStoreLegal(F64, F32) would return Legal, whereas with SSE2 it would return Expand. And since the Target doesn't seem to actually handle a truncstore for double -> float, it would just output a store of a full double in the space for a float hence overwriting other bits on the stack. Patch by Luqman Aden! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217410 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-09 01:13:36 +00:00
Hans Wennborg	4cd53531fd	Fast-ISel: Remove dead code after falling back from selecting call instructions (PR20863) Previously, fast-isel would not clean up after failing to select a call instruction, because it would have called flushLocalValueMap() which moves the insertion point, making SavedInsertPt in selectInstruction() invalid. Fixing this by making SavedInsertPt a member variable, and having flushLocalValueMap() update it. This removes some redundant code at -O0, and more importantly fixes PR20863. Differential Revision: http://reviews.llvm.org/D5249 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217401 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-08 20:24:10 +00:00
Matt Arsenault	ef4bb30475	R600/SI: Replace LDS atomics with no return versions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217379 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-08 15:07:31 +00:00
Chad Rosier	b30d031de4	[AArch64] Improve AA to remove unneeded edges in the AA MI scheduling graph. Patch by Sanjin Sijaric <ssijaric@codeaurora.org>! Phabricator Review: http://reviews.llvm.org/D5103 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217371 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-08 14:43:48 +00:00
Chandler Carruth	8ceea90956	[x86] Revert my over-eager commit in r217332. I hadn't actually run all the tests yet and these combines have somewhat surprisingly far reaching effects. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217333 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-07 12:37:11 +00:00
Chandler Carruth	e328c5ea83	[x86] Tweak the rules surrounding 0,0 and 1,1 v2f64 shuffles and add support for MOVDDUP which is really important for matrix multiply style operations that do lots of non-vector-aligned load and splats. The original motivation was to add support for MOVDDUP as the lack of it regresses matmul_f64_4x4 by 5% or so. However, all of the rules here were somewhat suspicious. First, we should always be using the floating point domain shuffles, regardless of how many copies we have to make as a movapd is crazy faster than the domain switching cost on some chips. (Mostly because movapd is crazy cheap.) Because SHUFPD can't do the copy-for-free trick of the PSHUF instructions, there is no need to avoid canonicalizing on UNPCK variants, so do that canonicalizing. This also ensures we have the chance to form MOVDDUP. =] Second, we assume SSE2 support when doing any vector lowering, and given that we should just use UNPCKLPD and UNPCKHPD as they can operate on registers or memory. If vectors get spilled or come from memory at all this is going to allow the load to be folded into the operation. If we want to optimize for encoding size (the only difference, and only a 2 byte difference) it should be done much later, likely after RA. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217332 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-07 12:02:14 +00:00
Matt Arsenault	6a712e709d	R600/SI: Relax a few tests to help enable scheduler git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217320 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-06 20:44:41 +00:00
Matt Arsenault	360ed46f68	R600/SI: Fix broken check lines. Fix missing check, and hardcoded register numbers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217318 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-06 20:37:56 +00:00
Chandler Carruth	7cd7154421	[x86] Fix a pretty horrible bug and inconsistency in the x86 asm parsing (and latent bug in the instruction definitions). This is effectively a revert of r136287 which tried to address a specific and narrow case of immediate operands failing to be accepted by x86 instructions with a pretty heavy hammer: it introduced a new kind of operand that behaved differently. All of that is removed with this commit, but the test cases are both preserved and enhanced. The core problem that r136287 and this commit are trying to handle is that gas accepts both of the following instructions: insertps $192, %xmm0, %xmm1 insertps $-64, %xmm0, %xmm1 These will encode to the same byte sequence, with the immediate occupying an 8-bit entry. The first form was fixed by r136287 but that broke the prior handling of the second form! =[ Ironically, we would still emit the second form in some cases and then be unable to re-assemble the output. The reason why the first instruction failed to be handled is because prior to r136287 the operands ere marked 'i32i8imm' which forces them to be sign-extenable. Clearly, that won't work for 192 in a single byte. However, making thim zero-extended or "unsigned" doesn't really address the core issue either because it breaks negative immediates. The correct fix is to make these operands 'i8imm' reflecting that they can be either signed or unsigned but must be 8-bit immediates. This patch backs out r136287 and then changes those places as well as some others to use 'i8imm' rather than one of the extended variants. Naturally, this broke something else. The custom DAG nodes had to be updated to have a much more accurate type constraint of an i8 node, and a bunch of Pat immediates needed to be specified as i8 values. The fallout didn't end there though. We also then ceased to be able to match the instruction-specific intrinsics to the instructions so modified. Digging, this is because they too used i32 rather than i8 in their signature. So I've also switched those intrinsics to i8 arguments in line with the instructions. In order to make the intrinsic adjustments of course, I also had to add auto upgrading for the intrinsics. I suspect that the intrinsic argument types may have led everything down this rabbit hole. Pretty happy with the result. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217310 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-06 10:00:01 +00:00
Chandler Carruth	469c73bc27	[x86] Fix an embarressing bug in the INSERTPS formation code. The mask computation was totally wrong, but somehow it didn't really show up with llc. I've added an assert that triggers on multiple existing test cases and updated one of them to show the correct value. There appear to still be more bugs lurking around insertps's mask. =/ However, note that this only really impacts the new vector shuffle lowering. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217289 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-05 23:19:45 +00:00
Sanjay Patel	52af82df95	Allow vector fsub ops with constants to get the same optimizations as scalars. This problem is bigger than just fsub, but this is the minimum fix to solve fneg for PR20556 ( http://llvm.org/bugs/show_bug.cgi?id=20556 ), and we solve zero subtraction with the same change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217286 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-05 22:26:22 +00:00
Rafael Espindola	eaa85e2027	Revert "Disable the fix for pr20793 because of a gnu ld bug." This reverts commit r217211. Both the bfd ld and gold outputs were valid. They were using a Rela relocation, so the value present in the relocated location was not used, which caused me to misread the output. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217264 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-05 18:03:38 +00:00
Matt Arsenault	89a7e3ec3e	R600/SI: Use same complex patterns for DS atomics This fixes hitting the same negative base offset problem that was already fixed for regular loads and stores. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217256 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-05 16:24:58 +00:00
Jan Vesely	286f644bce	R600: Fix FROUND round halfway cases away from zero Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu> Reviewed-by: Tom Stellard <tom@stellard.net> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217250 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-05 14:26:54 +00:00
Tom Stellard	7cda2d0666	R600/SI: Use S_ADD_U32 and S_SUB_U32 for low half of 64-bit operations https://bugs.freedesktop.org/show_bug.cgi?id=83416 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217248 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-05 14:07:59 +00:00
Chandler Carruth	c1c5dcf069	[x86] Factor out the zero vector insertion logic in the new vector shuffle lowering for integer vectors and share it from v4i32, v8i16, and v16i8 code paths. Ironically, the SSE2 v16i8 code for this is now better than the SSSE3! =] Will have to fix the SSSE3 code next to just using a single pshufb. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217240 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-05 10:36:31 +00:00
Jiangning Liu	b20b9bf9fd	[AArch64] Add pass to enable additional comparison optimizations by CSE. Patched by Sergey Dmitrouk. This pass tries to make consecutive compares of values use same operands to allow CSE pass to remove duplicated instructions. For this it analyzes branches and adjusts comparisons with immediate values by converting: GE -> GT GT -> GE LT -> LE LE -> LT and adjusting immediate values appropriately. It basically corrects two immediate values towards each other to make them equal. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217220 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-05 02:55:24 +00:00
Rafael Espindola	6cf4a0f506	Disable the fix for pr20793 because of a gnu ld bug. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217211 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-05 00:14:12 +00:00
Rafael Espindola	295a0088db	Fix pr20793. With this patch the third field of llvm.global_ctors is also used on ELF. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217202 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-04 23:03:58 +00:00
Tim Northover	8dcac5d77a	AArch64: fix vector-immediate BIC/ORR on big-endian devices. Follow up to r217138, extending the logic to other NEON-immediate instructions. As before, the instruction already performs the correct operation and we're just using a different type for convenience, so we want a true nop-cast. Patch by Asiri Rathnayake. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217159 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-04 15:05:24 +00:00
Tim Northover	dfe4e3e706	AArch64: fix big-endian immediate materialisation We were materialising big-endian constants using DAG nodes with types different from what was requested, followed by a bitcast. This is fine on little-endian machines where bitcasting is a nop, but we need a slightly different representation for big-endian. This adds a new set of NVCAST (natural-vector cast) operations which are always nops. Patch by Asiri Rathnayake. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217138 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-04 09:46:14 +00:00
Chandler Carruth	ae98867126	[x86] Teach the new v4i32 shuffle lowering some more tricks to recognize vzext patterns and insert-element patterns that for SSE4 have dedicated instructions. With this we can enable the experimental mode in a regression test that happens to cover some of the past set of issues. You can see that the new logic does significantly better here on the floating point cases. A follow-up to this change and the previous ones will hoist the logic into helpers so it can be shared across element type sizes as in this particular case it generalizes cleanly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217136 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-04 09:26:30 +00:00
Juergen Ributzka	cd72c216cd	Revert r216803 "[MachineSinking] Clear kill flag of all operands at all their uses." This reverts commit r216803, because it might have broken the buildbot. The issue is tracked in PR20842. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217120 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-04 02:07:36 +00:00
Juergen Ributzka	68a4ab08b3	[FastISel][AArch64] Add target-specific lowering for logical operations. This change adds support for immediate and shift-left folding into logical operations. This fixes rdar://problem/18223183. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217118 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-04 01:29:18 +00:00
Chandler Carruth	fa2dfaedf2	[x86] Teach the new vector shuffle lowering about the zero masking abilities of INSERTPS which are really powerful and come up in very important contexts such as forming diagonal matrices, etc. With this I ended up being able to remove the somewhat weird helper I added for INSERTPS because we can collapse the entire state to a no-op mask. Added a bunch of tests for inserting into a zero-ish vector. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217117 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-04 01:13:48 +00:00
Matt Arsenault	c9cc488dfe	R600/SI: Try to keep i32 mul on SALU Also fix bug this exposed where when legalizing an immediate operand, a v_mov_b32 would be created with a VSrc dest register. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217108 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-03 23:24:35 +00:00
Chandler Carruth	699fd1909e	[x86] Teach the new vector shuffle lowering about the simplest of 'insertps' patterns. This replaces two shuffles with a single insertps in very common cases. My next patch will extend this to leverage the zeroing capabilities of insertps which will allow it to be used in a much wider set of cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217100 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-03 22:48:34 +00:00
Chandler Carruth	36cf5d68be	[x86] Add an SSE4.1 mode to this test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217072 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-03 20:39:06 +00:00
Chandler Carruth	87508f1d87	[x86] Make this test check everything for both SSE2 and AVX1 modes, using a common 'all' prefix for the common test output. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217063 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-03 19:39:10 +00:00
Lang Hames	07ad198d6c	Add a regression test to sanity check the PBQP allocator. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217057 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-03 18:04:10 +00:00
Tom Stellard	ce4caf146f	R600/SI: Add a pattern for i64 and in a branch git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217041 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-03 15:22:41 +00:00
Renato Golin	218805d21d	Check-label a bit more specific Sometimes, the .file could be reordered and it'd identify the ldr in the filename as a bad match. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217037 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-03 13:32:08 +00:00
Alexander Potapenko	fac68d2d70	Fix PR20800: correctly calculate the offset of the subq instruction when generating compact unwind info. This CL replaces the constant DarwinX86AsmBackend.PushInstrSize with a method that lets the backend account for different sizes of "push %reg" instruction sizes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217020 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-03 07:11:34 +00:00
Juergen Ributzka	847547086d	Reapply r216805 "[MachineCombiner][AArch64] Use the correct register class for MADD, SUB, and OR."" This reapplies r216805 with a fix to a copy-past error, which resulted in an incorrect register class. Original commit message: Select the correct register class for the various instructions that are generated when combining instructions and constrain the registers to the appropriate register class. This fixes rdar://problem/18183707. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217019 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-03 07:07:10 +00:00
Juergen Ributzka	dd7a7107c1	[FastISel][AArch64] Add target-dependent instruction selection for Add/Sub. There is already target-dependent instruction selection support for Adds/Subs to support compares and the intrinsics with overflow check. This takes advantage of the existing infrastructure to also support Add/Sub, which allows the folding of immediates, sign-/zero-extends, and shifts. This fixes rdar://problem/18207316. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217007 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-03 01:38:36 +00:00
Renato Golin	418103c4d4	Missing test from r216989 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216990 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-02 22:46:18 +00:00
Renato Golin	ddcf3bd0a0	Only emit movw on ARMv6T2+ Fix PR18364. Patch by Dimitry Andric. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216989 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-02 22:45:13 +00:00
Juergen Ributzka	79ec2ed417	[FastISel][AArch64] Use the target-dependent selection code for shifts first. This uses the target-dependent selection code for shifts first, which allows us to create better code for shifts with immediates and sign-/zero-extend folding. Vector type are not handled yet and the code falls back to target-independent instruction selection for these cases. This fixes rdar://problem/17907920. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216985 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-02 22:33:57 +00:00
Robin Morisset	76b55cc4b1	[X86] Allow atomic operations using immediates to avoid using a register The only valid lowering of atomic stores in the X86 backend was mov from register to memory. As a result, storing an immediate required a useless copy of the immediate in a register. Now these can be compiled as a simple mov. Similarily, adding/and-ing/or-ing/xor-ing an immediate to an atomic location (but through an atomic_store/atomic_load, not a fetch_whatever intrinsic) can now make use of an 'add $imm, x(%rip)' instead of using a register. And the same applies to inc/dec. This second point matches the first issue identified in http://llvm.org/bugs/show_bug.cgi?id=17281 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216980 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-02 22:16:29 +00:00
Matt Arsenault	2aab51a118	R600/SI: Relax some ordering in tests. This will help with enabling misched git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216971 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-02 21:45:50 +00:00
Matt Arsenault	9c21df64a4	R600/SI: Fix hardcoded register numbers in test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216944 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-02 20:43:07 +00:00
Matt Arsenault	f471c483e6	R600/SI: Add failing testcase. This is broken when 64-bit add is only partially moved to the VALU. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216933 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-02 19:12:31 +00:00
Matt Arsenault	f7a3c7e705	Fix interference caused by fmul 2, x -> fadd x, x If an fmul was introduced by lowering, it wouldn't be folded into a multiply by a constant since the earlier combine would have replaced the fmul with the fadd. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216932 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-02 19:02:53 +00:00
Reid Kleckner	f93099eb1c	CodeGen: Handle va_start in the entry block Also fix a small copy-paste bug in X86ISelLowering where Chain should have been used in place of DAG.getEntryToken(). Fixes PR20828. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216929 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-02 18:42:44 +00:00
Hal Finkel	2633f795c6	Enable splitting indexing from loads with TargetConstants When I recommitted r208640 (in r216898) I added an exclusion for TargetConstant offsets, as there is no guarantee that a backend can handle them on generic ADDs (even if it generates them during address-mode matching) -- and, specifically, applying this transformation directly with TargetConstants caused a self-hosting failure on PPC64. Ignoring all TargetConstants, however, is less than ideal. Instead, for non-opaque constants, we can convert them into regular constants for use with the generated ADD (or SUB). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216908 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-02 16:05:23 +00:00
Rafael Espindola	1e556a80ff	Replace -use-init-array with -use-ctors. We have been using .init-array for most systems for quiet some time, but tools like llc are still defaulting to .ctors because the old option was never changed. This patch makes llc default to .init-array and changes the option to be -use-ctors. Clang is not affected by this. It has its own fancier logic. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216905 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-02 13:54:53 +00:00
David Xu	4e2b661005	Merge Extend and Shift into a UBFX git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216899 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-02 09:33:56 +00:00
Hal Finkel	3da41a28a1	Revert "Revert '[DAGCombiner] Split up an indexed load if only the base pointer value is live'" I reverted r208640 in r209747 because r208640 broke self-hosting on PPC64. The underlying cause of the failure is that pre-inc loads with increments represented by ISD::TargetConstants were being transformed into ISD:::ADDs with ISD::TargetConstant operands. PPC doesn't have a pattern for those, and so they were selected as invalid r+r adds. This recommits r208640, rebased and with an exclusion for ISD::TargetConstant increments. This behavior seems correct, although in the future we might want to ask the target to split out the indexing that uses ISD::TargetConstants. Unfortunately, I don't yet have small test case where the relevant invalid 'add' instruction is not itself dead (and thus eliminated by DeadMachineInstructionElim -- sometimes bugpoint is too good at removing things) Original commit message (by Adam Nemet): Right now the load may not get DCE'd because of the side-effect of updating the base pointer. This can happen if we lower a read-modify-write of an illegal larger type (e.g. i48) such that the modification only affects one of the subparts (the lower i32 part but not the higher i16 part). See the testcase. In order to spot the dead load we need to revisit it when SimplifyDemandedBits decided that the value of the load is masked off. This is the CommitTargetLoweringOpt piece. I checked compile time with ARM64 by sending SPEC bitcode files through llc. No measurable change. Fixes <rdar://problem/16031651> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216898 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-02 06:24:04 +00:00
Jingyue Wu	88350bf61d	Fix a typo in comments in r216862, NFC PR20766 -> PR20776. Thanks Roman Divacky for the catch! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216883 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-01 14:55:04 +00:00
Tilmann Scheller	4016a9ea4a	[ARM] Add Thumb-2 code size optimization regression test for EOR. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216881 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-01 12:59:34 +00:00
Tilmann Scheller	9e6f09d7ce	ARM] Add Thumb-2 code size optimization regression test for BIC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216880 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-01 12:53:29 +00:00
Jingyue Wu	d43e6df10b	[MachineSink] Use the real post dominator tree Summary: Fixes a FIXME in MachineSinking. Instead of using the simple heuristics in isPostDominatedBy, use the real MachinePostDominatorTree. The old heuristics caused instructions to sink unnecessarily, and might create register pressure. Test Plan: Added a NVPTX codegen test to verify that our change is in effect. It also shows the unnecessary register pressure caused by over-sinking. Updated affected tests in AArch64 and X86. Reviewers: eliben, meheff, Jiangning Reviewed By: Jiangning Subscribers: jholewinski, aemerson, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D4814 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216862 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-01 03:47:25 +00:00
Juergen Ributzka	bcbae3d680	Revert r216805 "[MachineCombiner][AArch64] Use the correct register class for MADD, SUB, and OR." I think this broke the build bot. Reverting it for now until I have time to take a closer look. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216813 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-30 06:16:26 +00:00
Juergen Ributzka	4e92383b67	[MachineCombiner][AArch64] Use the correct register class for MADD, SUB, and OR. Select the correct register class for the various instructions that are generated when combining instructions and constrain the registers to the appropriate register class. This fixes rdar://problem/18183707. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216805 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-29 23:48:09 +00:00
Juergen Ributzka	e7f301e079	[FastISel][AArch64] Use the correct register class for branches. Also constrain the register class for branches. This fixes rdar://problem/18181496. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216804 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-29 23:48:06 +00:00
Juergen Ributzka	bc420a0cc1	[MachineSinking] Clear kill flag of all operands at all their uses. When sinking an instruction it might be moved past the original last use of one of its operands. This last use has the kill flag set and the verifier will obviously complain about this. Before Machine Sinking (AArch64): %vreg3<def> = ASRVXr %vreg1, %vreg2<kill> %XZR<def> = SUBSXrs %vreg4, %vreg1<kill>, 160, %NZCV<imp-def> ... After Machine Sinking: %XZR<def> = SUBSXrs %vreg4, %vreg1<kill>, 160, %NZCV<imp-def> ... %vreg3<def> = ASRVXr %vreg1, %vreg2<kill> This fix clears all the kill flags in all instruction that use the same operands as the instruction that is being sunk. This fixes rdar://problem/18180996. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216803 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-29 23:48:03 +00:00
Robin Morisset	217b38e19a	Fix typos in comments, NFC Summary: Just fixing comments, no functional change. Test Plan: N/A Reviewers: jfb Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D5130 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216784 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-29 21:53:01 +00:00
Reid Kleckner	9436574d1b	musttail: Forward regparms of variadic functions on x86_64 Summary: If a variadic function body contains a musttail call, then we copy all of the remaining register parameters into virtual registers in the function prologue. We track the virtual registers through the function body, and add them as additional registers to pass to the call. Because this is all done in virtual registers, the register allocator usually gives us good code. If the function does a call, however, it will have to spill and reload all argument registers (ew). Forwarding regparms on x86_32 is not implemented because most compilers don't support varargs in 32-bit with regparms. Reviewers: majnemer Subscribers: aemerson, llvm-commits Differential Revision: http://reviews.llvm.org/D5060 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216780 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-29 21:42:08 +00:00
Reid Kleckner	dae28732f4	Verifier: Don't reject varargs callee cleanup functions We've rejected these kinds of functions since r28405 in 2006 because it's impossible to lower the return of a callee cleanup varargs function. However there are lots of legal ways to leave such a function without returning, such as aborting. Today we can leave a function with a musttail call to another function with the correct prototype, and everything works out. I'm removing the verifier check declaring that a normal return from such a function is UB. Reviewed By: nlewycky Differential Revision: http://reviews.llvm.org/D5059 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216779 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-29 21:25:28 +00:00
Louis Gerbarg	6393b3a677	Remove spurious mask operations from AArch64 add->compares on 16 and 8 bit values This patch checks for DAG patterns that are an add or a sub followed by a compare on 16 and 8 bit inputs. Since AArch64 does not support those types natively they are legalized into 32 bit values, which means that mask operations are inserted into the DAG to emulate overflow behaviour. In many cases those masks do not change the result of the processing and just introduce a dependent operation, often in the middle of a hot loop. This patch detects the relevent DAG patterns and then tests to see if the transforms are equivalent with and without the mask, removing the mask if possible. The exact mechanism of this patch was discusses in http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-July/074444.html There is a reasonably good chance there are missed oppurtunities due to similiar (but not identical) DAG patterns that could be funneled into this test, adding them should be simple if we see test cases. Tests included. rdar://13754426 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216776 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-29 21:00:22 +00:00
Reid Kleckner	1469e29334	X86: Fix conflict over ESI between base register and rep;movsl The new solution is to not use this lowering if there are any dynamic allocas in the current function. We know up front if there are dynamic allocas, but we don't know if we'll need to create stack temporaries with large alignment during lowering. Conservatively assume that we will need such temporaries. Reviewed By: hans Differential Revision: http://reviews.llvm.org/D5128 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216775 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-29 20:50:31 +00:00
Juergen Ributzka	d8835d09ec	[FastISel][AArch64] Fix an incorrect kill flag due to a bug in SelectTrunc. When we select a trunc instruction we don't emit any code if the type is already i32 or smaller. This is because the instruction that uses the truncated value will deal with it. This behavior can incorrectly transfer a kill flag, which was meant for the result of the truncate, onto the source register. %2 = trunc i32 %1 to i16 ... = ... %2 -> ... = ... vreg1 <kill> ... = ... %1 ... = ... vreg1 This commit fixes this by emitting a COPY instruction, so that the result and source register are distinct virtual registers. This fixes rdar://problem/18178188. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216750 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-29 17:58:16 +00:00
Tilmann Scheller	59758c4337	[ARM] Add Thumb-2 code size optimization test for ASR (register). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216746 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-29 17:19:00 +00:00
Tilmann Scheller	b1424d72ca	[ARM] Add Thumb-2 code size optimization test for ASR (immediate). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216744 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-29 17:02:28 +00:00
Matt Arsenault	f4d57e7874	R600/SI: Use mad for fsub + fmul We can use a negate source modifier to match this for fsub. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216735 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-29 16:01:14 +00:00
Tim Northover	1e77dc84c4	AArch64: only try to get operand of a known node. A bug in r216725 meant we tried to discover the type of a SETCC before confirming the node actually was a SETCC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216734 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-29 15:34:58 +00:00
Jingyue Wu	87a2b36cf6	[NVPTX] Make the alignment an explicit argument to ldu/ldg Summary: Instead of specifying the alignment as metadata which may be destroyed by transformation passes, make the alignment the second argument to ldu/ldg intrinsic calls. Test Plan: ldu-ldg.ll ldu-i8.ll ldu-reg-plus-offset.ll Reviewers: eliben, meheff, jholewinski Reviewed By: meheff, jholewinski Subscribers: jholewinski, llvm-commits Differential Revision: http://reviews.llvm.org/D5093 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216731 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-29 15:30:20 +00:00
Tilmann Scheller	c5484a2704	[ARM] Make Thumb-2 code size optimization test more strict. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216729 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-29 15:13:35 +00:00
Tilmann Scheller	f238c1844e	[ARM] Add a first test for the Thumb-2 code size optimization pass. While working on a Thumb-2 code size optimization I just realized that we don't have any regression tests for it. So here's a first test case, I plan to increase the coverage over time. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216728 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-29 15:04:40 +00:00
Tim Northover	1f70cb9c14	AArch64: skip select/setcc combine in complex case. In an llvm-stress generated test, we were trying to create a v0iN type and asserting when that failed. This case could probably be handled by the function, but not without added complexity and the situation it arises in is sufficiently odd that there's probably no benefit anyway. Should fix PR20775. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216725 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-29 13:05:18 +00:00
Robert Khasanov	37e671e894	[SKX] Enable lowering of integer CMP operations. Added new types to Legalizer. Fixed getSetCCResultType function Added lowering tests. Reviewed by Elena Demikhovsky. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216717 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-29 08:46:04 +00:00
Job Noorman	d2323fc295	Do not assume the value passed to memset is an i32. The code in SelectionDAG::getMemset for some reason assumes the value passed to memset is an i32. This breaks the generated code for targets that only have registers smaller than 32 bits because the value might get split into multiple registers by the calling convention. See the test for the MSP430 target included in the patch for an example. This patch ensures that nothing is assumed about the type of the value. Instead, the type is taken from the selected overload of the llvm.memset intrinsic. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216716 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-29 08:23:53 +00:00
Jiangning Liu	3cd73a5ded	[AArch64] Fix some failures exposed by value type v4f16 and v8f16. 1) Add some missing bitcast patterns for v8f16. 2) Add type promotion for operand of ld/st operations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216706 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-29 01:31:42 +00:00
Juergen Ributzka	cf45151b2c	[FastISel][AArch64] Don't fold instructions that are not in the same basic block. This fix checks first if the instruction to be folded (e.g. sign-/zero-extend, or shift) is in the same machine basic block as the instruction we are folding into. Not doing so can result in incorrect code, because the value might not be live-out of the basic block, where the value is defined. This fixes rdar://problem/18169495. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216700 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-29 00:19:21 +00:00
Jim Grosbach	0d34b1ed26	AArch64: More correctly constrain target vector extend lowering. The AArch64 target lowering for [zs]ext of vectors is set up to handle input simple types and expects the generic SDag path to do something reasonable with anything that's not a simple type. The code, however, was only checking that the result type was a simple type and assuming that implied that the source type would also be a simple type. That's not a valid assumption, as operations like "zext <1 x i1> %0 to <1 x i32>" demonstrate. The fix is to simply explicitly validate the source type as well as the result type. PR20791 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216689 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-28 22:08:28 +00:00
Rafael Espindola	4bb535027e	On MachO, don't put non-private constants in mergeable sections. On MachO, putting a symbol that doesn't start with a 'L' or 'l' in one of the __TEXT,__literal* sections prevents the linker from merging the context of the section. Since private GVs are the ones the get mangled to start with 'L' or 'l', we now only put those on the __TEXT,__literal* sections. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216682 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-28 20:13:31 +00:00
Sanjay Patel	cf9661c6f9	Fix a logic bug in x86 vector codegen: sext (zext (x) ) != sext (x) (PR20472). Remove a block of code from LowerSIGN_EXTEND_INREG() that was added with: http://llvm.org/viewvc/llvm-project?view=revision&revision=177421 And caused: http://llvm.org/bugs/show_bug.cgi?id=20472 (more analysis here) http://llvm.org/bugs/show_bug.cgi?id=18054 The testcases confirm that we (1) don't remove a zext op that is necessary and (2) generate a pmovz instead of punpck if SSE4.1 is available. Although pmovz is 1 byte longer, it allows folding of the load, and so saves 3 bytes overall. Differential Revision: http://reviews.llvm.org/D4909 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216679 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-28 18:59:22 +00:00
David Xu	5ca793561e	Generate CMN when comparing a short int with minus git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216651 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-28 04:59:53 +00:00
Chandler Carruth	da3a293313	[x86] Clean up some tests to use FileCheck and combine two into a single file. Changing code that is covered by these tests is just too hard to debug currently, and now it will be clear the nature of the changes. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216643 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-28 03:41:28 +00:00
Juergen Ributzka	4a76317ebb	[FastISel] Undo phi node updates when falling-back to SelectionDAG. The included test case would fail, because the MI PHI node would have two operands from the same predecessor. This problem occurs when a switch instruction couldn't be selected. This happens always, because there is no default switch support for FastISel to begin with. The problem was that FastISel would first add the operand to the PHI nodes and then fall-back to SelectionDAG, which would then in turn add the same operands to the PHI nodes again. This fix removes these duplicate PHI node operands by reseting the PHINodesToUpdate to its original state before FastISel tried to select the instruction. This fixes <rdar://problem/18155224>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216640 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-28 02:06:55 +00:00
Juergen Ributzka	d24494d672	[FastISel] Currently instructions are folded very aggressively for AArch64 into the memory operation, which can lead to the use of killed operands: %vreg1<def> = ADDXri %vreg0<kill>, 2 %vreg2<def> = LDRBBui %vreg0, 2 ... = ... %vreg1 ... This usually happens when the result is also used by another non-memory instruction in the same basic block, or any instruction in another basic block. This fix teaches hasTrivialKill to not only check the LLVM IR that the value has a single use, but also to check if the register that represents that value has already been used. This can happen when the instruction with the use was folded into another instruction (in this particular case a load instruction). This fixes rdar://problem/18142857. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216634 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-28 00:09:46 +00:00
Juergen Ributzka	a26b1bdcc8	Revert "[FastISel][AArch64] Don't fold instructions too aggressively into the memory operation." Quentin pointed out that this is not the correct approach and there is a better and easier solution. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216632 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-27 23:09:40 +00:00
Juergen Ributzka	1f5263e43f	[FastISel][AArch64] Don't fold instructions too aggressively into the memory operation. Currently instructions are folded very aggressively into the memory operation, which can lead to the use of killed operands: %vreg1<def> = ADDXri %vreg0<kill>, 2 %vreg2<def> = LDRBBui %vreg0, 2 ... = ... %vreg1 ... This usually happens when the result is also used by another non-memory instruction in the same basic block, or any instruction in another basic block. If the computed address is used by only memory operations in the same basic block, then it is safe to fold them. This is because all memory operations will fold the address computation and the original computation will never be emitted. This fixes rdar://problem/18142857. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216629 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-27 22:52:33 +00:00
Juergen Ributzka	ccf53013cd	[FastISel][AArch64] Fix simplify address when the address comes from a shift. When the address comes directly from a shift instruction then the address computation cannot be folded into the memory instruction, because the zero register is not available as a base register. Simplify addess needs to emit the shift instruction and use the result as base register. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216621 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-27 21:38:33 +00:00
Juergen Ributzka	d445e4acdb	[FastISel][AArch64] Use the zero register for stores. Use the zero register directly when possible to avoid an unnecessary register copy and a wasted register at -O0. This also uses integer stores to store a positive floating-point zero. This saves us from materializing the positive zero in a register and then storing it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216617 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-27 21:04:52 +00:00
Oliver Stannard	5e487f8dc7	Teach the AArch64 backend about v4f16 and v8f16 This teaches the AArch64 backend to deal with the operations required to deal with the operations on v4f16 and v8f16 which are exposed by NEON intrinsics, plus the add, sub, mul and div operations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216555 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-27 16:16:04 +00:00
Chandler Carruth	7e3dc40fab	[x86] Fix a regression introduced with r213897 for 32-bit targets where we stopped efficiently lowering sextload using the SSE41 instructions for that operation. This is a consequence of a bad predicate I used thinking of the memory access needs. The code actually handles the cases where the predicate doesn't apply, and handles them much better. =] Simple fix and a test case added. Fixes PR20767. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216538 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-27 11:39:47 +00:00
Chandler Carruth	963a5e6c61	[SDAG] Re-instate r215611 with a fix to a pesky X86 DAG combine. This combine is essentially combining target-specific nodes back into target independent nodes that it "knows" will be combined yet again by a target independent DAG combine into a different set of target-independent nodes that are legal (not custom though!) and thus "ok". This seems... deeply flawed. The crux of the problem is that we don't combine un-legalized shuffles that are introduced by legalizing other operations, and thus we don't see a very profitable combine opportunity. So the backend just forces the input to that combine to re-appear. However, for this to work, the conditions detected to re-form the unlegalized nodes must be exactly right. Previously, failing this would have caused poor code (if you're lucky) or a crasher when we failed to select instructions. After r215611 we would fall back into the legalizer. In some cases, this just "fixed" the crasher by produces bad code. But in the test case added it caused the legalizer and the dag combiner to iterate forever. The fix is to make the alignment checking in the x86 side of things match the alignment checking in the generic DAG combine exactly. This isn't really a satisfying or principled fix, but it at least make the code work as intended. It also highlights that it would be nice to detect the availability of under aligned loads for a given type rather than bailing on this optimization. I've left a FIXME to document this. Original commit message for r215611 which covers the rest of the chang: [SDAG] Fix a case where we would iteratively legalize a node during combining by replacing it with something else but not re-process the node afterward to remove it. In a truly remarkable stroke of bad luck, this would (in the test case attached) end up getting some other node combined into it without ever getting re-processed. By adding it back on to the worklist, in addition to deleting the dead nodes more quickly we also ensure that if it stops being dead for any reason it makes it back through the legalizer. Without this, the test case will end up failing during instruction selection due to an and node with a type we don't have an instruction pattern for. It took many million runs of the shuffle fuzz tester to find this. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216537 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-27 11:22:16 +00:00
Elena Demikhovsky	fe0c6ead85	AVX-512: Added intrinsic for VMOVSS store form with mask. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216530 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-27 07:38:43 +00:00
Juergen Ributzka	fc03e72b4f	[FastISel][AArch64] Fix address simplification. When a shift with extension or an add with shift and extension cannot be folded into the memory operation, then the address calculation has to be materialized separately. While doing so the code forgot to consider a possible sign-/zero- extension. This fix folds now also the sign-/zero-extension into the add or shift instruction which is used to materialize the address. This fixes rdar://problem/18141718. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216511 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-27 00:58:30 +00:00
Juergen Ributzka	836f4bd090	[FastISel][AArch64] Fold Sign-/Zero-Extend into the shift immediate instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216510 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-27 00:58:26 +00:00
Yi Kong	2282afa6cc	ARM: Add patterns for dbg git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216451 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-26 12:47:26 +00:00
Chad Rosier	373fc00835	[AArch32] Add patterns for VCVT{A,N,P,M}. Patterns for lowering libm calls to VCVT{A,N,P,M} are also included. Phabricator Revision: http://reviews.llvm.org/D5033 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216388 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-25 16:56:33 +00:00
Hal Finkel	7ca2a7d742	[PowerPC] Add support for dcbtst and icbt (prefetch) Adds code generation support for dcbtst (data cache prefetch for write) and icbt (instruction cache prefetch for read - Book E cores only). We still end up with a 'cannot select' error for the non-supported prefetch intrinsic forms. This will be fixed in a later commit. Fixes PR20692. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216339 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-23 23:21:04 +00:00
Chad Rosier	8eb867e97d	Revert "ARM: improve RTABI 4.2 conformance on Linux" This reverts commit r215862 due to nightly failures. Will work on getting a reduced test case, but I wanted to get our bots green in the meantime. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216325 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-23 18:29:43 +00:00
Chandler Carruth	bfed08e41f	[x86] Start fixing a really subtle and terrible form of miscompile in these DAG combines. The DAG auto-CSE thing is truly terrible. Due to it, when RAUW-ing a node with its operand, you can cause its uses to CSE to itself, which then causes their uses to become your uses which causes them to be picked up by the RAUW. For nodes that are determined to be "no-ops", this is "fine". But if the RAUW is one of several steps to enact a transformation, this causes the DAG to really silently eat an discard nodes that you would never expect. It took days for me to actually pinpoint a test case triggering this and a really frustrating amount of time to even comprehend the bug because I never even thought about the ability of RAUW to iteratively consume nodes due to CSE-ing them into itself. To fix this, we have to build up a brand-new chain of operations any time we are combining across (potentially) intervening nodes. But once the logic is added to do this, another issue surfaces: CombineTo eagerly deletes the one node combined, but no others. This is... really frustrating. If deleting it makes its operands become dead, those operand nodes often won't go onto the worklist in the order you would want -- they're already on it and not near the top. That means things higher on the worklist will get combined prior to these dead nodes being GCed out of the worklist, and if the chain is long, the immediate users won't be enough to re-detect where the root of the chain is that became single-use again after deleting the dead nodes. The better way to do this is to never immediately delete nodes, and instead to just enqueue them so we can recursively delete them. The combined-from node is typically not on the worklist anyways by virtue of having been popped off.... But that in turn breaks other tests that require CombineTo to delete unused nodes. :: sigh :: Fortunately, there is a better way. This whole routine should have been returning the replacement rather than using CombineTo which is quite hacky. Switch to that, and all the pieces fall together. I suspect the same kind of miscompile is possible in the half-shuffle folding code, and potentially the recursive folding code. I'll be switching those over to a pattern more like this one for safety's sake even though I don't immediately have any test cases for them. Note that the only way I got a test case for this instance was with heavily DAG combined 256-bit shuffle sequences generated by my fuzzer. ;] git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216319 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-23 10:25:15 +00:00
Nick Lewycky	f591b9c33e	Revert r215611 because it caused the infinite loop in bug 20736. There is a reduced testcase in that bug. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216307 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-23 00:45:03 +00:00
Reid Kleckner	d89c0abc07	ARM / x86_64 varargs: Don't save regparms in prologue without va_start There's no need to do this if the user doesn't call va_start. In the future, we're going to have thunks that forward these register parameters with musttail calls, and they won't need these spills for handling va_start. Most of the test suite changes are adding va_start calls to existing tests to keep things working. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216294 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-22 21:59:26 +00:00

1 2 3 4 5 ...

11556 Commits