llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2025-02-24 12:29:33 +00:00

Author	SHA1	Message	Date
Elena Demikhovsky	9cc691fa05	AVX-512, X86: Added lowering for shift operations for SKX. The other changes in the LowerShift() are not functional, just to make the code more convenient. So, the functional changes for SKX only. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237129 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-12 13:25:46 +00:00
John Brawn	e38f45effc	[ARM] Use AEABI aligned function variants AEABI defines aligned variants of memcpy etc. that can be faster than the default version due to not having to do alignment checks. When emitting target code for these functions make use of these aligned variants if possible. Also convert memset to memclr if possible. Differential Revision: http://reviews.llvm.org/D8060 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237127 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-12 13:13:38 +00:00
Igor Laevsky	00666a17ff	Reverse ordering of base and derived pointer during safepoint lowering. According to the documentation in StackMap section for the safepoint we should have: "The first Location in each pair describes the base pointer for the object. The second is the derived pointer actually being relocated." But before this change we emitted them in reverse order - derived pointer first, base pointer second. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237126 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-12 13:12:14 +00:00
Vasileios Kalintiris	98ed1175d9	[mips][FastISel] Handle calls with non legal types i8 and i16. Summary: Allow calls with non legal integer types based on i8 and i16 to be processed by mips fast-isel. Based on a patch by Reed Kotler. Test Plan: "Make check" test forthcoming. Test-suite passes at O0/O2 and with mips32 r1/r2 Reviewers: rkotler, dsanders Subscribers: llvm-commits, rfuhler Differential Revision: http://reviews.llvm.org/D6770 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237121 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-12 12:29:17 +00:00
Vasileios Kalintiris	d8b1e16033	[mips][FastISel] Simplify callabi.ll by using multiple check prefixes. Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9635 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237119 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-12 12:17:11 +00:00
Vasileios Kalintiris	4ffee4b853	[mips][FastISel] Allow computation of addresses from constant expressions. Summary: Try to compute addresses when the offset from a memory location is a constant expression. Based on a patch by Reed Kotler. Test Plan: Passes test-suite for -O0/O2 and mips 32 r1/r2 Reviewers: rkotler, dsanders Subscribers: llvm-commits, aemerson, rfuhler Differential Revision: http://reviews.llvm.org/D6767 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237117 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-12 12:08:31 +00:00
Elena Demikhovsky	cfff317af7	AVX-512: select operation for i1 vectors like: select i1 %cond, <16 x i1> %a, <16 x i1> %b. I added pseudo-CMOV patterns to resolve the "select". Added tests for KNL and SKX. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237106 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-12 09:36:52 +00:00
Michael Kuperstein	9cf6c24660	[X86] DAGCombine should not assume arbitrary vector types are simple The X86-specific DAGCombine for stores should not assume vector types are always simple. This fixes PR23476. Differential Revision: http://reviews.llvm.org/D9659 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237097 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-12 07:33:07 +00:00
Eric Christopher	0552d51c45	Migrate existing backends that care about software floating point to use the information in the module rather than TargetOptions. We've had and clang has used the use-soft-float attribute for some time now so have the backends set a subtarget feature based on a particular function now that subtargets are created based on functions and function attributes. For the one middle end soft float check go ahead and create an overloadable TargetLowering::useSoftFloat function that just checks the TargetSubtargetInfo in all cases. Also remove the command line option that hard codes whether or not soft-float is set by using the attribute for all of the target specific test cases - for the generic just go ahead and add the attribute in the one case that showed up. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237079 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-12 01:26:05 +00:00
Andrew Kaylor	e04fe6c135	[WinEH] Handle nested landing pads that return directly to the parent function. Differential Revision: http://reviews.llvm.org/D9684 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237063 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-11 23:06:02 +00:00
Andrew Kaylor	aeeca679f7	[WinEH] Update exception numbering to give handlers their own base state. Differential Revision: http://reviews.llvm.org/D9512 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237014 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-11 19:41:19 +00:00
Pirama Arumuga Nainar	c9f2596abc	[X86] Updates to X86 backend for f16 promotion Summary: r235215 adds support for f16 to be considered as a load/store type and promote f16 operations to f32. This patch has miscellaneous fixes for the X86 backend so all f16 operations are handled: 1. Set loadextaction for f16 vectors to expand. 2. Handle FP_EXTEND in a switch statement when handling v2f32 3. Do not fold (FP_TO_SINT (load f16)) into FP_TO_INT*_IN_MEM or (store (SINT_TO_FP )) to a FILD. Tests included. Reviewers: ab, srhines, delena Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9092 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237004 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-11 17:14:39 +00:00
Elena Demikhovsky	63df7cd4ea	AVX-512: Changed CC parameter in "cmp" intrinsic from i8 to i32 according to the Intel Spec by Igor Breger (igor.breger@intel.com) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236979 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-11 09:03:14 +00:00
Elena Demikhovsky	8189eb4d7e	AVX-512: Added SKX instructions and intrinsics: {add/sub/mul/div/} x {ps/pd} x {128/256} 2. max/min with sae By Asaf Badouh (asaf.badouh@intel.com) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236971 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-11 06:05:05 +00:00
Elena Demikhovsky	c7c44fa75e	AVX-512: fixed UINT_TO_FP operation for 512-bit types. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236955 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-10 14:23:52 +00:00
Elena Demikhovsky	59a0fe6e3f	AVX-512: fixed a bug in i1 vectors lowering git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236947 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-10 10:33:32 +00:00
NAKAMURA Takumi	31e094b55d	llvm/test/CodeGen/AArch64/tailcall_misched_graph.ll: s/REQUIRE/REQUIRES/ git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236928 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-09 05:59:00 +00:00
James Y Knight	01d1787830	Fix MergeConsecutiveStore for non-byte-sized memory accesses. The bug showed up as a compile-time assertion failure: Assertion `NumBits >= MIN_INT_BITS && "bitwidth too small"' failed when building msan tests on x86-64. Prior to r236850, this bug was masked due to a bogus alignment check, which also accidentally rejected non-byte-sized accesses. Afterwards, an invalid ElementSizeBytes == 0 got further into the function, and triggered the assertion failure. It would probably be a good idea to allow it to handle merging stores of unusual widths as well, but for now, to un-break it, I'm just making the minimal fix. Differential Revision: http://reviews.llvm.org/D9626 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236927 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-09 03:13:37 +00:00
Pete Cooper	f5b930b2e2	[Fast-ISel] Don't mark the first use of a remat constant as killed. When emitting something like 'add x, 1000' if we remat the 1000 then we should be able to mark the vreg containing 1000 as killed. Given that we go bottom up in fast-isel, a later use of 1000 will be higher up in the BB and won't kill it, or be impacted by the lower kill. However, rematerialised constant expressions aren't generated bottom up. The local value save area grows downwards. This means that if you remat 2 constant expressions which both use 1000 then the first will kill it, then the second, which is lower in the BB will read a killed register. This is the case in the attached test where the 2 GEPs both need to generate 'add x, 6680' for the constant offset. Note that this commit only makes kill flag generation conservative. There's nothing else obviously wrong with the local value save area growing downwards, and in fact it needs to for handling arbitrarily complex constant expressions. However, it would be nice if there was a solution which would let us generate more accurate kill flags, or just kill flags completely. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236922 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-09 00:51:03 +00:00
Arnold Schwaighofer	75e36e847e	ScheduleDAGInstrs: In functions with tail calls PseudoSourceValues are not non-aliasing distinct objects The code that builds the dependence graph assumes that two PseudoSourceValues don't alias. In a tail calling function two FixedStackObjects might refer to the same location. Worse 'immutable' fixed stack objects like function arguments are not immutable and will be clobbered. Change this so that a load from a FixedStackObject is not invariant in a tail calling function and don't return a PseudoSourceValue for an instruction in tail calling functions when building the dependence graph so that we handle function arguments conservatively. Fix for PR23459. rdar://20740035 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236916 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-08 23:52:00 +00:00
Hans Wennborg	262697d9d8	Switch lowering: cluster adjacent fall-through cases even at -O0 It's cheap to do, and codegen is much faster if cases can be merged into clusters. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236905 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-08 21:23:39 +00:00
Pete Cooper	f9f04c25b2	[Fast-ISel] Clear kill flags on registers replaced by updateValueMap. When selecting an extract instruction, we don't actually generate code but instead work out which register we are reading, and rewrite uses of the extract def to the source register. This is done via updateValueMap,. However, its possible that the source register we are rewriting to to also have uses. If those uses are after a kill of the value we are rewriting from then we have uses after a kill and the verifier fails. This code checks for the case where the to register is also used, and if so it clears all kill on the from register. This is conservative, but better that always clearing kills on the from register. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236897 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-08 20:46:54 +00:00
Brendon Cahoon	74b576041a	[Hexagon] Generate more hardware loops Refactored parts of the hardware loop pass to generate more. Also, added more tests. Differential Revision: http://reviews.llvm.org/D9568 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236896 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-08 20:18:21 +00:00
Pete Cooper	c2347d5cf1	[X86] Fast-ISel was incorrectly always killing the source of a truncate. A trunc from i32 to i1 on x86_64 generates an instruction such as %vreg19<def> = COPY %vreg9:sub_8bit<kill>; GR8:%vreg19 GR32:%vreg9 However, the copy here should only have the kill flag on the 32-bit path, not the 64-bit one. Otherwise, we are killing the source of the truncate which could be used later in the program. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236890 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-08 18:29:42 +00:00
Pat Gavlin	5c7f7462e4	Extend the statepoint intrinsic to allow statepoints to be marked as transitions from GC-aware code to code that is not GC-aware. This changes the shape of the statepoint intrinsic from: @llvm.experimental.gc.statepoint(anyptr target, i32 # call args, i32 unused, ...call args, i32 # deopt args, ...deopt args, ...gc args) to: @llvm.experimental.gc.statepoint(anyptr target, i32 # call args, i32 flags, ...call args, i32 # transition args, ...transition args, i32 # deopt args, ...deopt args, ...gc args) This extension offers the backend the opportunity to insert (somewhat) arbitrary code to manage the transition from GC-aware code to code that is not GC-aware and back. In order to support the injection of transition code, this extension wraps the STATEPOINT ISD node generated by the usual lowering lowering with two additional nodes: GC_TRANSITION_START and GC_TRANSITION_END. The transition arguments that were passed passed to the intrinsic (if any) are lowered and provided as operands to these nodes and may be used by the backend during code generation. Eventually, the lowering of the GC_TRANSITION_{START,END} nodes should be informed by the GC strategy in use for the function containing the intrinsic call; for now, these nodes are instead replaced with no-ops. Differential Revision: http://reviews.llvm.org/D9501 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236888 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-08 18:07:42 +00:00
Pete Cooper	d90099d36c	Clear kill flags on all used registers when sinking instructions. The test here was sinking the AND here to a lower BB: %vreg7<def> = ANDWri %vreg8, 0; GPR32common:%vreg7,%vreg8 TBNZW %vreg8<kill>, 0, <BB#1>; GPR32common:%vreg8 which meant that vreg8 was read after it was killed. This commit changes the code from clearing kill flags on the AND to clearing flags on all registers used by the AND. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236886 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-08 17:54:32 +00:00
Brendon Cahoon	7fd56b1e4a	[Hexagon] Update AnalyzeBranch, etc target hooks Improved the AnalyzeBranch, InsertBranch, and RemoveBranch functions in order to handle more of our branch instructions. This requires changes to analyzeCompare and PredicateInstructions. Specifically, we've added support for new value compare jumps, improved handling of endloop, added more compare instructions, and improved support for predicate instructions. Differential Revision: http://reviews.llvm.org/D9559 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236876 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-08 16:16:29 +00:00
Andrea Di Biagio	405e5f276b	[X86] Teach 'getTargetShuffleMask' how to look through ISD::WrapperRIP when decoding a PSHUFB mask. The function 'getTargetShuffleMask' already knows how to deal with PSHUFB nodes where the mask node is a load from constant pool, and the constant pool node is wrapped by a X86ISD::Wrapper node. This patch extends that logic by teaching it how to also look through X86ISD::WrapperRIP. This helps function combineX86ShufflesRecusively to combine more shuffle sequences containing PSHUFB nodes if we are in RIPRel PIC mode. Before this change, llc (with -relocation-model=pic -march=x86-64) was unable to decode a pshufb where the mask was loaded from a constant pool. For example, the no-op shuffle from test 'x86-fold-pshufb.ll' was not folded into its operand, so instead of generating a single 'movaps' the backend always generated a sub-optimal 'movdqa + pshufb' sequence. Added test x86-fold-pshufb.ll. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236863 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-08 15:11:07 +00:00
James Y Knight	ff3820a182	Fix test added in r236850 for OSX builders. Need to specify triple so that llvm emits the asm syntax that the test expected. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236855 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-08 14:04:54 +00:00
James Y Knight	e9359f427e	Fix alignment checks in MergeConsecutiveStores. 1) check whether the alignment of the memory is sufficient for the merged store or load to be efficient. Not doing so can result in some ridiculously poor code generation, if merging creates a vector operation which must be aligned but isn't. 2) DON'T check that the alignment of each load/store is equal. If you're merging 2 4-byte stores, the first might have 8-byte alignment, but the second certainly will have 4-byte alignment. We do want to allow those to be merged. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236850 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-08 13:47:01 +00:00
Vasileios Kalintiris	ca33d72658	[mips] Emit the .insn directive for empty basic blocks. Summary: In microMIPS, labels need to know whether they are on code or data. This is indicated with STO_MIPS_MICROMIPS and can be inferred by being followed by instructions. For empty basic blocks, we can ensure this by emitting the .insn directive after the label. Also, this fixes some failures in our out-of-tree microMIPS buildbots, for the exception handling regression tests under: SingleSource/Regression/C++/EH Reviewers: dsanders Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9530 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236815 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-08 09:10:15 +00:00
Eric Christopher	7a5d9c0611	Now that we have a soft-float attribute, use it instead of the hard coded command line option for the Mips soft float tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236801 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-08 00:57:22 +00:00
Pete Cooper	e4eff4b231	Clear kill flags in tail duplication. If we duplicate an instruction then we must also clear kill flags on any uses we rewrite. Otherwise we might be killing a register which was used in other BBs. For example, here the entry BB ended up with these instructions, the ADD having been tail duplicated. %vreg24<def> = t2ADDri %vreg10<kill>, 1, pred:14, pred:%noreg, opt:%noreg; GPRnopc:%vreg24 rGPR:%vreg10 %vreg22<def> = COPY %vreg10; GPR:%vreg22 rGPR:%vreg10 The copy here is inserted after the add and so needs vreg10 to be live. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236782 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-07 21:48:26 +00:00
Pete Cooper	d887b7ef2f	[AArch64] Fix sext/zext folding in address arithmetic. We were accidentally folding a sign/zero extend in to address arithmetic in a different BB when the extend wasn't available there. Cross BB fast-isel isn't safe, so restrict this to only when the extend is in the same BB as the use. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236764 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-07 19:21:36 +00:00
Nemanja Ivanovic	308873bcb8	Add VSX Scalar loads and stores to the PPC back end This patch corresponds to review: http://reviews.llvm.org/D9440 It adds a new register class to the PPC back end to contain single precision values in VSX registers. Additionally, it adds scalar loads and stores for VSX registers. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236755 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-07 18:24:05 +00:00
Diego Novillo	aa46024ea3	Fix information loss in branch probability computation. Summary: This addresses PR 22718. When branch weights are too large, they were being clamped to the range [1, MaxWeightForBB]. But this clamping is only applied to edges that go outside the range, so it distorts the relative branch probabilities. This patch changes the weight calculation to scale every branch so the relative probabilities are preserved. The scaling is done differently now. First, all the branch weights are added up, and if the sum exceeds 32 bits, it computes an integer scale to bring all the weights within the range. The patch fixes an existing test that had slightly wrong branch probabilities due to the previous clamping. It now gets branch weights scaled accordingly. Reviewers: dexonsmith Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D9442 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236750 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-07 17:22:06 +00:00
Sanjay Patel	39cf555429	[x86] eliminate unnecessary shuffling/moves with unary scalar math ops (PR21507) Finish the job that was abandoned in D6958 following the refactoring in http://reviews.llvm.org/rL230221: 1. Uncomment the intrinsic def for the AVX r_Int instruction. 2. Add missing r_Int entries to the load folding tables; there are already tests that check these in "test/Codegen/X86/fold-load-unops.ll", so I haven't added any more in this patch. 3. Add patterns to solve PR21507 ( https://llvm.org/bugs/show_bug.cgi?id=21507 ). So instead of this: movaps %xmm0, %xmm1 rcpss %xmm1, %xmm1 movss %xmm1, %xmm0 We should now get: rcpss %xmm0, %xmm0 And instead of this: vsqrtss %xmm0, %xmm0, %xmm1 vblendps $1, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm1[0],xmm0[1,2,3] We should now get: vsqrtss %xmm0, %xmm0, %xmm0 Differential Revision: http://reviews.llvm.org/D9504 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236740 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-07 15:48:53 +00:00
Elena Demikhovsky	d08d0340e5	AVX-512: Added all forms of FP compare instructions for KNL and SKX. Added intrinsics for the instructions. CC parameter of the intrinsics was changed from i8 to i32 according to the spec. By Igor Breger (igor.breger@intel.com) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236714 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-07 11:24:42 +00:00
NAKAMURA Takumi	a3ded6b432	llvm/test/CodeGen/X86/llc-override-mcpu-mattr.ll: Tweak not to be affected by x64 Calling Convention. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236710 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-07 10:18:28 +00:00
Akira Hatanaka	e6f0494cd8	Let llc and opt override "-target-cpu" and "-target-features" via command line options. This commit fixes a bug in llc and opt where "-mcpu" and "-mattr" wouldn't override function attributes "-target-cpu" and "-target-features" in the IR. Differential Revision: http://reviews.llvm.org/D9537 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236677 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-06 23:54:14 +00:00
Pete Cooper	28b0dda32e	Handle dead defs in the if converter. We had code such as this: r2 = ... t2Bcc label1: ldr ... r2 label2; return r2<dead, def> The if converter was transforming this to r2<def> = ... return [pred] r2<dead,def> ldr <r2, kill> return which fails the machine verifier because the ldr now reads from a dead def. The fix here detects dead defs in stepForward and passes them back to the caller in the clobbers list. The caller then clears the dead flag from the def is the value is live. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236660 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-06 22:51:04 +00:00
Pete Cooper	537ff782aa	Fix incorrect kill flags in fastisel. If called twice in the same BB on the same constant, FastISel::fastEmit_ri_ was marking the materialized vreg as killed on each use, instead of only the last use. Change this to only mark the last use as killed by making earlier uses check if the vreg is already used elsewhere. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236650 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-06 22:09:29 +00:00
Pete Cooper	0040d179d2	[x86] Fix register class of folded load index reg. When folding a load in to another instruction, we need to fix the class of the index register Otherwise, it could be something like GR64 not GR64_NOSP and would fail the machine verifier. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236644 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-06 21:37:19 +00:00
Tim Northover	84b8c10729	CodeGen: move over-zealous assert into actual if statement. It's quite possible to encounter an insertvalue instruction that's more deeply nested than the value we're looking for, but when that happens we really mustn't compare beyond the end of the index array. Since I couldn't see any guarantees about what comparisons std::equal makes, we probably need to directly check the size beforehand. In practice, I suspect most std::equal implementations would probably bail early, which would be OK. But just in case... rdar://20834485 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236635 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-06 20:07:38 +00:00
Duncan P. N. Exon Smith	7838051bda	DwarfDebug: Emit number of bytes in .debug_loc entry directly Emit the number of bytes in a `.debug_loc` entry directly. The old code created temp labels (expensive), emitted the difference between them, and then emitted one on each side of the relevant bytes. (I'm looking at `llc` memory usage on `verify-uselistorder.lto.opt.bc` (the optimized version of ld64's `-save-temps` when linking the `verify-uselistorder` executable in an LTO bootstrap). I've hacked `MCContext::Allocate()` to just call `malloc()` instead of using the `BumpPtrAllocator` so that the heap profile is easier to read. As far as peak memory is concerned, `MCContext::Allocate()` is equivalent to a leak, since it only gets freed at process teardown. In my heap profile, this patch drops memory usage of `DwarfDebug::emitDebugLoc()` from 132.56 MB (11.4%) down to 29.86 MB (2.7%) at peak memory. Some of that must be noise from `SmallVector` (or other) allocations -- peak memory only dropped from 1160 MB down to 1100 MB -- but this nevertheless shaves 5% off the top.) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236629 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-06 19:11:20 +00:00
Pawel Bylica	5304314702	Readd the regression test from r236584. Calling convention fixed to linux. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236610 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-06 16:43:21 +00:00
Pete Cooper	99413f0d40	[ARM] Fast-Isel was incorrectly selecting <2 x double> adds. With neon enabled, we reach SelectBinaryFPOp and are able to get registers for a <2 x double> add. However, we shouldn't actually attempt arithmetic on it as ARMIselLowering says "v2f64 is legal so that QR subregs can be extracted as f64 elements, but neither Neon nor VFP support any arithmetic operations on it." This commit disables SelectBinaryFPOp for any vector types. There's already a FIXME to try handle neon. Doing so would require fixing this conditional which isn't safe for vectors 'VT == MVT::f64 \|\| VT == MVT::i64' git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236609 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-06 16:39:17 +00:00
Bill Schmidt	982f60be44	[PPC64LE] Adjust vector splats during VSX swap optimization The initial code drop for VSX swap optimization permitted the optimization only when all operations in a web of related computation are lane-insensitive. For some lane-sensitive operations, we can still permit the optimization provided that we make adjustments to those operations. This patch adds special handling for vector splats so that their presence doesn't kill the optimization. Vector splats are lane-sensitive since they identify by number a vector element to be used as the source of a splat. When swap optimizations take place, the desired vector element will move to the opposite doubleword of the quadword vector. We thus replace the index I by (I + N/2) % N, where N is the number of elements in the vector. A new test case is added to test that swap optimization succeeds when vector splats are present, and that the proper input element is used as the source of the splat. An ancillary change removes SH_BUILDVEC as one of the kinds of special handling that may be required by VSX swap optimization. From experience with GCC, I had expected to need some modifications for vector build operations, but I did not find that to be the case. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236606 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-06 15:40:46 +00:00
Artyom Skrobov	78fc2103c9	[ARM] generate VMAXNM/VMINNM for a compare followed by a select, in safe math mode too git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236590 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-06 11:44:10 +00:00
Pawel Bylica	5fa37a7f2d	Revert regression test from r236584. Temporary remove a regression test added in r236584. It fails on Windows. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236586 91177308-0d34-0410-b5e6-96231b3b80d8	2015-05-06 10:41:46 +00:00

1 2 3 4 5 ...

13507 Commits