llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-08-23 02:29:18 +00:00

Author	SHA1	Message	Date
Bill Schmidt	03b9f9f2b6	[PowerPC 2/4] Little-endian adjustments for VSX insert/extract operations For little endian, we need to make some straightforward adjustments in the code expansions for scalar_to_vector and vector_extract of v2f64. First, scalar_to_vector must place the scalar into vector element zero. However, our implementation of SUBREG_TO_REG will place it into big-element vector element zero (high-order bits), and for little endian we need it in the low-order bits. The LE implementation splats the high-order doubleword into the low-order doubleword. Second, the meaning of (vector_extract x, 0) and (vector_extract x, 1) must be reversed for similar reasons. A new test is added that tests code generation for insertelement and extractelement for both element 0 and element 1. It is disabled in this patch but enabled in patch 4/4, for reasons stated in the test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223788 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-09 16:43:32 +00:00
Robert Khasanov	c50f9f15f5	[AVX512] Added VPBROADCAST{BWDQ} (Load with Broadcast Integer Data from General Purpose Register) encodings for AVX512-BW/VL subsets Added encoding tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223787 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-09 16:38:41 +00:00
Juergen Ributzka	b49ee78320	[CodeGenPrepare] Split branch conditions into multiple conditional branches. This optimization transforms code like: bb1: %0 = icmp ne i32 %a, 0 %1 = icmp ne i32 %b, 0 %or.cond = or i1 %0, %1 br i1 %or.cond, label %TrueBB, label %FalseBB into a multiple branch instructions like: bb1: %0 = icmp ne i32 %a, 0 br i1 %0, label %TrueBB, label %bb2 bb2: %1 = icmp ne i32 %b, 0 br i1 %1, label %TrueBB, label %FalseBB This optimization is already performed by SelectionDAG, but not by FastISel. FastISel cannot perform this optimization, because it cannot generate new MachineBasicBlocks. Performing this optimization at CodeGenPrepare time makes it available to both - SelectionDAG and FastISel - and the implementation in SelectiuonDAG could be removed. There are currenty a few differences in codegen for X86 and PPC, so this commmit only enables it for FastISel. Reviewed by Jim Grosbach This fixes rdar://problem/19034919. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223786 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-09 16:36:13 +00:00
Juergen Ributzka	50f10eb432	Move function to obtain branch weights into the BranchInst class. NFC. Make this function available to other parts of LLVM. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223784 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-09 16:36:06 +00:00
Bill Schmidt	b900895384	[PowerPC 1/4] Little-endian adjustments for VSX loads/stores This patch addresses the inherent big-endian bias in the lxvd2x, lxvw4x, stxvd2x, and stxvw4x instructions. These instructions load vector elements into registers left-to-right (with the first element loaded into the high-order bits of the register), regardless of the endian setting of the processor. However, these are the only vector memory instructions that permit unaligned storage accesses, so we want to use them for little-endian. To make this work, a lxvd2x or lxvw4x is replaced with an lxvd2x followed by an xxswapd, which swaps the doublewords. This works for lxvw4x as well as lxvd2x, because for lxvw4x on an LE system the vector elements are in LE order (right-to-left) within each doubleword. (Thus after lxvw2x of a <4 x float> the elements will appear as 1, 0, 3, 2. Following the swap, they will appear as 3, 2, 0, 1, as desired.) For stores, an stxvd2x or stxvw4x is replaced with an stxvd2x preceded by an xxswapd. Introduction of extra swap instructions provides correctness, but obviously is not ideal from a performance perspective. Future patches will address this with optimizations to remove most of the introduced swaps, which have proven effective in other implementations. The introduction of the swaps is performed during lowering of LOAD, STORE, INTRINSIC_W_CHAIN, and INTRINSIC_VOID operations. The latter are used to translate intrinsics that specify the VSX loads and stores directly into equivalent sequences for little endian. Thus code that uses vec_vsx_ld and vec_vsx_st does not have to be modified to be ported from BE to LE. We introduce new PPCISD opcodes for LXVD2X, STXVD2X, and XXSWAPD for use during this lowering step. In PPCInstrVSX.td, we add new SDType and SDNode definitions for these (PPClxvd2x, PPCstxvd2x, PPCxxswapd). These are recognized during instruction selection and mapped to the correct instructions. Several tests that were written to use -mcpu=pwr7 or pwr8 are modified to disable VSX on LE variants because code generation changes with this and subsequent patches in this set. I chose to include all of these in the first patch than try to rigorously sort out which tests were broken by one or another of the patches. Sorry about that. The new test vsx-ldst-builtin-le.ll, and the changes to vsx-ldst.ll, are disabled until LE support is enabled because of breakages that occur as noted in those tests. They are re-enabled in patch 4/4. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223783 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-09 16:35:51 +00:00
Rafael Espindola	8a48c86a5f	Move method out of line to make buildbot happy. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223781 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-09 16:18:11 +00:00
Rafael Espindola	4f20ed1c36	Don't lookup an object symbol name in the module. Instead, walk the obj symbol list in parallel to find the GV. This shouldn't change anything on ELF where global symbols are not mangled, but it is a step toward supporting other object formats. Gold itself is ELF only, but bfd ld supports COFF and the logic in the gold plugin could be reused on lld. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223780 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-09 16:13:59 +00:00
Chandler Carruth	de0cdb0890	[x86] Fix the test to actually test things for the CPU names, add the missing barcelona CPU which that test uncovered, and remove the 32-bit x86 CPUs which I really wasn't prepared to audit and test thoroughly. If anyone wants to clean up the 32-bit only x86 CPUs, go for it. Also, if anyone else wants to try to de-duplicate the AMD CPUs, that'd be cool, but from the looks of it wouldn't save as much as it did for the Intel CPUs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223774 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-09 14:25:55 +00:00
Aaron Ballman	93fac96d4b	Removing an unused variable to silence a -Wunused-but-set-variable warning. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223773 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-09 13:20:11 +00:00
Asiri Rathnayake	838ec33e0c	Fix modified immediate bug reported by MC Hammer. Instructions of the form [ADD Rd, pc, #imm] are manually aliased in processInstruction() to use ADR. To accomodate this, mod_imm handling had to be tweaked a bit. Turns out it was the manual aliasing that must be tweaked to accommodate mod_imms instead. More information about the parsed instruction is available at the point where processInstruction() is invoked, which makes it easier to detect a mod_imm at that point rather than trying to detect a potential alias when a mod_imm is being prepped. Added a test case and fixed some white spaces as well. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223772 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-09 13:14:58 +00:00
Chandler Carruth	8729843d31	[x86] Bring some sanity to the x86 CPU processor definitions. Notably, this adds simple micro-architecture names for the Intel CPU variants, and defines the old 'core'-based names as aliases. GCC has started to simplify their documented interface to use these names as well, so it seems like we can start to converge on a consistent pattern. I'd appreciate Intel double checking the entries that aren't yet documented widely, especially Atom (Bonnell and Silvermont), Knights Landing, and Skylake. But this change shouldn't break any existing users. Also, ran clang-format to re-format this code and it actually worked (modulo a tiny bug) so hopefully we can start to stop thinking about formatting this stuff. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223769 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-09 10:58:36 +00:00
Chandler Carruth	e78a87b633	Teach instcombine to canonicalize "element extraction" from a load of an integer and "element insertion" into a store of an integer into actual element extraction, element insertion, and vector loads and stores. Previously various parts of LLVM (including instcombine itself) would introduce integer loads and stores into the code as a way of opaquely loading and storing "bits". In some cases (such as a memcpy of std::complex<float> object) we will eventually end up using those bits in non-integer types. In order for SROA to effectively promote the allocas involved, it splits these "store a bag of bits" integer loads and stores up into the constituent parts. However, for non-alloca loads and tsores which remain, it uses integer math to recombine the values into a large integer to load or store. All of this would be "fine", except that it forces LLVM to go through integer math to combine and split up values. While this makes perfect sense for integers (and in fact is critical for bitfields to end up lowering efficiently) it is terrible for non-integer types, especially floating point types. We have a much more canonical way of representing the act of concatenating the bits of two SSA values in LLVM: a vector and insertelement. This patch teaching InstCombine to use this representation. With this patch applied, LLVM will no longer introduce integer math into the critical path of every loop over std::complex<float> operations such as those that make up the hot path of ... oh, most HPC code, Eigen, and any other heavy linear algebra library. For the record, I looked extensively at fixing this in other parts of the compiler, but it just doesn't work: - We really do want to canonicalize memcpy and other bit-motion to integer loads and stores. SSA values are tremendously more powerful than "copy" intrinsics. Not doing this regresses massive amounts of LLVM's scalar optimizer. - We really do need to split up integer loads and stores of this form in SROA or every memcpy of a trivially copyable struct will prevent SSA formation of the members of that struct. It essentially turns off SROA. - The closest alternative is to actually split the loads and stores when partitioning with SROA, but this has all of the downsides historically discussed of splitting up loads and stores -- the wide-store information is fundamentally lost. We would also see performance regressions for bitfield-heavy code and other places where the integers aren't really intended to be split without seemingly arbitrary logic to treat integers totally differently. - We can effectively fix this in instcombine, so it isn't that hard of a choice to make IMO. Differential Revision: http://reviews.llvm.org/D6548 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223764 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-09 08:55:32 +00:00
Michael Ilseman	ebaf61e498	Skip declarations in the case of functions. This is a revert of r223521 in spirit, if not in content. I am not sure why declarations ended up in LazilyLinkGlobalValues in the first place; that will take some more investigation. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223763 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-09 08:20:06 +00:00
Elena Demikhovsky	ff3745b4ff	AVX-512: Added some comments to ERI scalar intrinsics. No functional change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223761 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-09 07:06:32 +00:00
Owen Anderson	59bf8e81f3	Fix a few instances found in SelectionDAG where we were not handling F16 at parity with F32 and F64. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223760 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-09 06:50:39 +00:00
Mohit K. Bhakkad	b284d4ae31	test commit (spelling correction) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223758 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-09 06:31:07 +00:00
Michael Kuperstein	77c1b73211	[X86] Convert esp-relative movs of function arguments into pushes, step 1 This handles the simplest case for mov -> push conversion: 1. x86-32 calling convention, everything is passed through the stack. 2. There is no reserved call frame. 3. Only registers or immediates are pushed, no attempt to combine a mem-reg-mem sequence into a single PUSHmm. Differential Revision: http://reviews.llvm.org/D6503 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223757 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-09 06:10:44 +00:00
David Majnemer	2959baffd2	Reland r223754 The commit is identical except a reference to `GV' should have been to `GVal'. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223756 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-09 05:56:09 +00:00
David Majnemer	ebddbe8ba6	Revert "AsmParser: Reject invalid mismatch between forward ref and def" This reverts commit r223754. I've upset the buildbots. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223755 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-09 05:50:11 +00:00
David Majnemer	82258b45e4	AsmParser: Reject invalid mismatch between forward ref and def Don't assume that the forward referenced entity was of the same global-kind as the new entity. This fixes PR21779. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223754 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-09 05:43:56 +00:00
Bill Schmidt	13dd854d8c	Restore r223709 as it was meant to be, and enable FeatureP8Vector for P8 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223751 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-09 03:02:48 +00:00
NAKAMURA Takumi	3d6e1eeeb2	Revert r223709, "[PowerPC]Activate FeatureVSX for the Power target", to unbreak bots. CodeGen/PowerPC/vsx-p8.ll was failing. '+power8-vector' is not a recognized feature for this target (ignoring feature) llvm/test/CodeGen/PowerPC/vsx-p8.ll:33:14: error: expected string not found in input ; CHECK-REG: lxvw4x 34, 0, 3 ^ <stdin>:50:2: note: scanning from here .align 3 ^ <stdin>:61:2: note: possible intended match here lvx 3, 0, 3 ^ git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223729 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-09 01:03:27 +00:00
Hal Finkel	014b06e7b2	Handle early-clobber registers in the aggressive anti-dep breaker The aggressive anti-dep breaker, used by the PowerPC backend during post-RA scheduling (but is available to all targets), did not handle early-clobber MI operands (at all). When constructing the list of available registers for the replacement of some def operand, check the using instructions, and remove registers assigned to early-clobbered defs from the set. Fixes PR21452. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223727 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-09 01:00:59 +00:00
Tom Stellard	9c276c7ab6	R600/SI: Set MayStore = 0 on MUBUF loads git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223722 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-09 00:03:54 +00:00
Tom Stellard	781a7ae1ac	R600/SI: Move setting of the lds bit to the base MUBUF class git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223721 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-09 00:03:51 +00:00
Colin LeMahieu	a3b01e5189	[Hexagon] Removing old def versions and replacing usages with versions that have encodings. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223720 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-08 23:55:43 +00:00
Tom Stellard	653ef32216	MISched: Fix moving stores across barriers This fixes an issue with ScheduleDAGInstrs::buildSchedGraph where stores without an underlying object would not be added as a predecessor to the current BarrierChain. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223717 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-08 23:36:48 +00:00
Colin LeMahieu	73ed2dcdac	[Hexagon] Adding any8, all8, and/or/xor/andn/orn/not predicate register forms, mask, and vitpack instructions and patterns. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223710 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-08 23:07:59 +00:00
Bill Seurer	ac6e0c82ed	[PowerPC]Activate FeatureVSX for the Power target This change activates FeatureVSX for Power 7 and Power 8 in PPC.td. http://reviews.llvm.org/D6570 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223709 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-08 23:07:12 +00:00
Hal Finkel	b849e04d2b	[PowerPC] Don't use a non-allocatable register to implement the 'cc' alias GCC accepts 'cc' as an alias for 'cr0', and we need to do the same when processing inline asm constraints. This had previously been implemented using a non-allocatable register, named 'cc', that was listed as an alias of 'cr0', but the infrastructure does not seem to support this properly (neither the register allocator nor the scheduler properly accounts for the alias). Instead, we can just process this as a naming alias inside of the inline asm constraint-processing code, so we'll do that instead. There are two regression tests, one where the post-RA scheduler did the wrong thing with the non-allocatable alias, and one where the register allocator did the wrong thing. Fixes PR21742. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223708 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-08 22:54:22 +00:00
Colin LeMahieu	27fbb34173	[Hexagon] Adding xtype doubleword add, sub, and, or, xor and patterns. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223702 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-08 22:19:14 +00:00
Colin LeMahieu	9804956609	[Hexagon] Adding xtype doubleword comparisons. Removing unused multiclass. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223701 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-08 21:56:47 +00:00
Colin LeMahieu	7b9be18636	[Hexagon] Adding xtype parity, min, minu, max, maxu instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223693 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-08 21:19:18 +00:00
Colin LeMahieu	a321bd4f19	[Hexagon] Adding xtype halfword add/sub ll/hl/lh/hh/sat/<<16 instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223692 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-08 20:33:01 +00:00
Matt Arsenault	dbd00bf51a	R600/SI: Move continue after checking s_mov_b32. There's nothing else to bother trying to shrink these. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223686 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-08 19:55:43 +00:00
David Majnemer	8eef439528	ConstantFold: Zero-sized globals might land on top of another global A zero sized array is zero sized and might share its address with another global. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223684 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-08 19:35:31 +00:00
Rafael Espindola	7fd7effa37	Lazily link GlobalVariables and GlobalAliases. We were already lazily linking functions, but all GlobalValues can be treated uniformly for this. The test updates are to ensure that a given GlobalValue is still linked in. This fixes pr21494. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223681 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-08 18:45:16 +00:00
Colin LeMahieu	4772502317	[Hexagon] Adding add/sub with saturation. Removing unused def. Cleaning up shift patterns. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223680 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-08 18:33:49 +00:00
David Majnemer	fca9c7b21c	InstSimplify: Try to bring back the rest of r223583 This reverts r223624 with a small tweak, hopefully this will make stage3 equivalent. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223679 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-08 18:30:43 +00:00
Bruno Cardoso Lopes	43edafcc07	[CompactUnwind] Fix register encoding logic Fix a compact unwind encoding logic bug which would try to encode more callee saved registers than it should, leading to early bail out in the encoding logic and abusive use of DWARF frame mode unnecessarily. Also remove no-compact-unwind.ll which was testing the wrong thing based on this bug and move it to valid 'compact unwind' tests. Added other few more tests too. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223676 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-08 18:18:32 +00:00
Rafael Espindola	dcc44c64c1	Don't crash when the key of a comdat is lazily linked. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223673 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-08 18:05:48 +00:00
Justin Bogner	70b0751080	InstrProf: An intrinsic and lowering for instrumentation based profiling Introduce the ``llvm.instrprof_increment`` intrinsic and the ``-instrprof`` pass. These provide the infrastructure for writing counters for profiling, as in clang's ``-fprofile-instr-generate``. The implementation of the instrprof pass is ported directly out of the CodeGenPGO classes in clang, and with the followup in clang that rips that code out to use these new intrinsics this ends up being NFC. Doing the instrumentation this way opens some doors in terms of improving the counter performance. For example, this will make it simple to experiment with alternate lowering strategies, and allows us to try handling profiling specially in some optimizations if we want to. Finally, this drastically simplifies the frontend and puts all of the lowering logic in one place. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223672 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-08 18:02:35 +00:00
Tim Northover	811474b929	AArch64: treat HFAs containing "half" types as blocks too. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223669 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-08 17:54:58 +00:00
Andrea Di Biagio	eafdf26d89	[X86] Improved tablegen patters for matching TZCNT/LZCNT. Teach ISel how to match a TZCNT/LZCNT from a conditional move if the condition code is X86_COND_NE. Existing tablegen patterns only allowed to match TZCNT/LZCNT from a X86cond with condition code equal to X86_COND_E. To avoid introducing extra rules, I added an 'ImmLeaf' definition that checks if the condition code is COND_E or COND_NE. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223668 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-08 17:47:18 +00:00
Colin LeMahieu	cf2daa3671	[Hexagon] Adding combine reg, reg with predicated forms. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223667 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-08 17:33:06 +00:00
Colin LeMahieu	5c7adadf6d	[Hexagon] Adding packhl instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223664 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-08 17:01:18 +00:00
Daniel Sanders	b856112d87	[mips] Add Mips-specific CCIf's for accessing the MipsCCState. NFC. Reviewers: vmedic Reviewed By: vmedic Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D6213 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223662 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-08 15:40:09 +00:00
Andrea Di Biagio	ae16ff1c42	[X86] Improved lowering of packed v8i16 vector shifts by non-constant count. Before this patch, the backend sub-optimally expanded the non-constant shift count of a v8i16 shift into a sequence of two 'movd' plus 'movzwl'. With this patch the backend checks if the target features sse4.1. If so, then it lets the shuffle legalizer deal with the expansion of the shift amount. Example: ;; define <8 x i16> @test(<8 x i16> %A, <8 x i16> %B) { %shamt = shufflevector <8 x i16> %B, <8 x i16> undef, <8 x i32> zeroinitializer %shl = shl <8 x i16> %A, %shamt ret <8 x i16> %shl } ;; Before (with -mattr=+avx): vmovd %xmm1, %eax movzwl %ax, %eax vmovd %eax, %xmm1 vpsllw %xmm1, %xmm0, %xmm0 retq Now: vpxor %xmm2, %xmm2, %xmm2 vpblendw $1, %xmm1, %xmm2, %xmm1 vpsllw %xmm1, %xmm0, %xmm0 retq git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223660 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-08 14:36:51 +00:00
Rafael Espindola	968f0454b8	Move the ValueMap lookup inside linkFunctionBody. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223659 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-08 14:25:26 +00:00
Rafael Espindola	eaa2992eb2	Use range loops. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223658 91177308-0d34-0410-b5e6-96231b3b80d8	2014-12-08 14:20:10 +00:00

1 2 3 4 5 ...

74863 Commits