llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-07-21 18:29:45 +00:00

Author	SHA1	Message	Date
Sanjay Patel	c5992119fc	Enable FeatureFastUAMem for btver2 Allow unaligned 16-byte memop codegen for btver2. No functional changes for any other subtargets. Replace the existing supposed small memcpy test with an actual test of a small memcpy. The previous test wasn't using FileCheck either. This patch should allow us to close PR21541 ( http://llvm.org/bugs/show_bug.cgi?id=21541 ). Differential Revision: http://reviews.llvm.org/D6360 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222925 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-28 18:40:18 +00:00
Sanjay Patel	28660d4b2f	Add a feature flag for slow 32-byte unaligned memory accesses [x86]. This patch adds a feature flag to avoid unaligned 32-byte load/store AVX codegen for Sandy Bridge and Ivy Bridge. There is no functionality change intended for those chips. Previously, the absence of AVX2 was being used as a proxy to detect this feature. But that hindered codegen for AVX-enabled AMD chips such as btver2 that do not have the 32-byte unaligned access slowdown. Performance measurements are included in PR21541 ( http://llvm.org/bugs/show_bug.cgi?id=21541 ). Differential Revision: http://reviews.llvm.org/D6355 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222544 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-21 17:40:04 +00:00
Alexey Volkov	d0d0424368	[X86] For Silvermont CPU use 16-bit division instead of 64-bit for small positive numbers Differential Revision: http://reviews.llvm.org/D5938 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222521 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-21 11:19:34 +00:00
Alexey Volkov	19e8fe05dc	[X86] Use ADD/SUB instead of INC/DEC for Haswell and Broadwell CPUs Differential Revision: http://reviews.llvm.org/D5934 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222141 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-17 16:17:51 +00:00
Sanjay Patel	e7c966f067	Use rcpss/rcpps (X86) to speed up reciprocal calcs (PR21385). This is a first step for generating SSE rcp instructions for reciprocal calcs when fast-math allows it. This is very similar to the rsqrt optimization enabled in D5658 ( http://reviews.llvm.org/rL220570 ). For now, be conservative and only enable this for AMD btver2 where performance improves significantly both in terms of latency and throughput. We may never enable this codegen for Intel Core* chips because the divider circuits are just too fast. On SandyBridge, divss can be as fast as 10 cycles versus the 21 cycle critical path for the rcp + mul + sub + mul + add estimate. Follow-on patches may allow configuration of the number of Newton-Raphson refinement steps, add AVX512 support, and enable the optimization for more chips. More background here: http://llvm.org/bugs/show_bug.cgi?id=21385 Differential Revision: http://reviews.llvm.org/D6175 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221706 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-11 20:51:00 +00:00
Andrea Di Biagio	c270e8abbe	[X86] Add 'FeatureSlowSHLD' to cpu 'bdver3'. Also explicit set FeatureAVX and FeatureSSE4A for all the bdver* cpus. This patch adds 'FeatureSlowSHLD' to 'bdver3'. According to the official AMD optimization guide for amdfam15: "Using alternative code in place of SHLD achieves lower overall latency and requires fewer execution resources. The 32-bit and 64-bit forms of ADD, ADC, SHR, and LEA (except 16-bit form) are DirectPath instructions, while SHLD is a VectorPath instruction." This patch also explicitly sets feature AVX and SSE4A for all the bdver* cpus. This part of the patch is a non-functional change and it is mainly done for clarity reasons (Both XOP and FMA4 already imply AVX and SSE4A). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221296 91177308-0d34-0410-b5e6-96231b3b80d8	2014-11-04 21:18:09 +00:00
Sanjay Patel	a46f06efe2	Use rsqrt (X86) to speed up reciprocal square root calcs This is a first step for generating SSE rsqrt instructions for reciprocal square root calcs when fast-math is allowed. For now, be conservative and only enable this for AMD btver2 where performance improves significantly - for example, 29% on llvm/projects/test-suite/SingleSource/Benchmarks/BenchmarkGame/n-body.c (if we convert the data type to single-precision float). This patch adds a two constant version of the Newton-Raphson refinement algorithm to DAGCombiner that can be selected by any target via a parameter returned by getRsqrtEstimate(). See PR20900 for more details: http://llvm.org/bugs/show_bug.cgi?id=20900 Differential Revision: http://reviews.llvm.org/D5658 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220570 91177308-0d34-0410-b5e6-96231b3b80d8	2014-10-24 17:02:16 +00:00
Sanjay Patel	a9d7398280	Add a scheduling model for AMD 16H Jaguar (btver2). This is a first pass at a scheduling model for Jaguar. It's structured largely on the existing SandyBridge and SLM sched models. Using this model, in addition to turning on the PostRA scheduler, results in some perf wins on internal and 3rd party benchmarks. There's not much difference in LLVM's test-suite benchmarking subset of tests. Differential Revision: http://reviews.llvm.org/D5229 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217457 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-09 20:07:07 +00:00
Robert Khasanov	3a34f5e115	[x86] Enable Broadwell target. Added FeatureSMAP. Broadwell ISA includes Haswell ISA + ADX + RDSEED + SMAP git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216161 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-21 09:16:12 +00:00
Kevin Enderby	42deb12738	Add support for the X86 secure guard extensions instructions in assembler (SGX). This allows assembling the two new instructions, encls and enclu for the SKX processor model. Note the diffs are a bit bigger than one might think, but to fit the new MRM_CF and MRM_D7 in the right places, things had to be renumbered and shuffled down, causing a bit more diffs. rdar://16228228 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214460 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-31 23:57:38 +00:00
Robert Khasanov	aac33cfc08	[SKX] Enabling SKX target and AVX512BW, AVX512DQ, AVX512VL features. Enabling HasAVX512{DQ,BW,VL} predicates. Adding VK2, VK4, VK32, VK64 masked register classes. Adding new types (v64i8, v32i16) to VR512. Extending calling conventions for new types (v64i8, v32i16) Patch by Zinovy Nis <zinovy.y.nis@intel.com> Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213545 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-21 14:54:21 +00:00
Elena Demikhovsky	0780b6db5d	AVX-512: dec/inc instructions are slow on KNL After Alexey Volkov, I'm adding the same property for KNL, that prefers ADD/SUB instead of INC/DEC. Added a test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212178 91177308-0d34-0410-b5e6-96231b3b80d8	2014-07-02 14:11:05 +00:00
Alexey Volkov	a2bc6951a0	[X86] Use ADD/SUB instead of INC/DEC for Silvermont According to Intel Software Optimization Manual on Silvermont INC or DEC instructions require an additional uop to merge the flags. As a result, a branch instruction depending on an INC or a DEC instruction incurs a 1 cycle penalty. Differential Revision: http://reviews.llvm.org/D3990 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210466 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-09 11:40:41 +00:00
Alexey Volkov	0d0bab5168	[X86] Tune LEA usage for Silvermont According to Intel Software Optimization Manual on Silvermont in some cases LEA is better to be replaced with ADD instructions: "The rule of thumb for ADDs and LEAs is that it is justified to use LEA with a valid index and/or displacement for non-destructive destination purposes (especially useful for stack offset cases), or to use a SCALE. Otherwise, ADD(s) are preferable." Differential Revision: http://reviews.llvm.org/D3826 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209198 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-20 08:55:50 +00:00
Chandler Carruth	905e33545c	[x86] Make the 'x86-64' cpu, what I see as and many use as the generic default architecture for reasonable modern x86 processors, actually be modern. This processor model should essentially be "tuned" for modern x86 chips as much as possible without undue penalties on any specific architecture. Previously we weren't even using the nice scheduling models. There are a few other tweaks needed here, but this change at least I have benchmarked across a decent swatch of chips (intel's clovertown, westmere, and sandybridge; amd's istanbul) and seen no significant regressions. If anyone has suggested ways to test this, just let me know. Somewhat alarmingly, no existing tests failed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208230 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-07 17:37:03 +00:00
Benjamin Kramer	3cddd1607c	Add a description for AMD's bdver4 (aka Excavator). This is just bdver3 + AVX2 + BMI2. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207847 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-02 15:47:07 +00:00
Alexey Volkov	177c1ef30d	Enable FeatureFastUAMem for Silvermont processor Differential Revision: http://llvm-reviews.chandlerc.com/D2982 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203218 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-07 09:03:49 +00:00
Alexey Volkov	adaa3e5760	Test commit Removed whitespace git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203216 91177308-0d34-0410-b5e6-96231b3b80d8	2014-03-07 08:28:44 +00:00
Craig Topper	93c9401dff	[x86] Add basic support for .code16 This is not really expected to work right yet. Mostly because we will still emit the OpSize (0x66) prefix in all the wrong places, along with a number of other corner cases. Those will all be fixed in the subsequent commits. Patch from David Woodhouse. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198584 91177308-0d34-0410-b5e6-96231b3b80d8	2014-01-06 04:55:54 +00:00
Rafael Espindola	4a6855441c	Change the default of AsmWriterClassName and isMCAsmWriter. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196065 91177308-0d34-0410-b5e6-96231b3b80d8	2013-12-02 04:55:42 +00:00
Ekaterina Romanova	46f7257ed1	SHLD/SHRD are VectorPath (microcode) instructions known to have poor latency on certain architectures. While generating SHLD/SHRD instructions is acceptable when optimizing for size, optimizing for speed on these platforms should be implemented using alternative sequences of instructions composed of add, adc, shr, shl, or and lea which are directPath instructions. These alternative instructions not only have a lower latency but they also increase the decode bandwidth by allowing simultaneous decoding of a third directPath instruction. AMD's processors family K7, K8, K10, K12, K15 and K16 are known to have SHLD/SHRD instructions with very poor latency. Optimization guides for these processors recommend using an alternative sequence of instructions. For these AMD's processors, I disabled folding (or (x << c) | (y >> (64 - c))) when we are not optimizing for size. It might be beneficial to disable this folding for some of the Intel's processors. However, since I couldn't find specific recommendations regarding using SHLD/SHRD instructions on Intel's processors, I haven't disabled this peephole for Intel. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195383 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-21 23:21:26 +00:00
Benjamin Kramer	00e3be6134	X86: Add a description for AMD bdver3 aka Steamroller. This is just bdver2 + FSGSBase. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193984 91177308-0d34-0410-b5e6-96231b3b80d8	2013-11-04 10:29:20 +00:00
Yunzhong Gao	cdb9bd7eb9	Enabling 3DNow! prefetch instruction for a few AMD processors: bobcat, jaguar, bulldozer and piledriver. Support for the instruction itself seems to have already been added in r178040. Differential Revision: http://llvm-reviews.chandlerc.com/D1933 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192828 91177308-0d34-0410-b5e6-96231b3b80d8	2013-10-16 19:04:11 +00:00
Nick Lewycky	e66dd40d74	Rename this feature to "cx16" to match gcc's flag name. Apparently these strings are directly tied to the flag names in clang with no remapping in between? git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192044 91177308-0d34-0410-b5e6-96231b3b80d8	2013-10-05 20:11:44 +00:00
Yunzhong Gao	4da61345ec	Adding a feature flag to the llvm backend for x86 TBM instruction set. Adding TBM feature to bdver2 processor; piledriver supports this instruction set according to the following document: http://developer.amd.com/wordpress/media/2012/10/New-Bulldozer-and-Piledriver-Instructions.pdf Phabricator code review is located here: http://llvm-reviews.chandlerc.com/D1692 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191324 91177308-0d34-0410-b5e6-96231b3b80d8	2013-09-24 18:21:52 +00:00
Preston Gurd	2ff37c701e	Remove unused code, which had been commented out. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190869 91177308-0d34-0410-b5e6-96231b3b80d8	2013-09-17 16:53:36 +00:00
Craig Topper	5fefc00bac	Make F16C feature flag imply AVX rather than just checking both at the patterns. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190775 91177308-0d34-0410-b5e6-96231b3b80d8	2013-09-16 04:29:58 +00:00
Preston Gurd	94dc6540a8	Adds support for Atom Silvermont (SLM) - -march=slm Implements Instruction scheduler latencies for Silvermont, using latencies from the Intel Silvermont Optimization Guide. Auto detects SLM. Turns on post RA scheduler when generating code for SLM. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190717 91177308-0d34-0410-b5e6-96231b3b80d8	2013-09-13 19:23:28 +00:00
Ben Langmuir	1f1bd9a54d	Partial support for Intel SHA Extensions (sha1rnds4) Add basic assembly/disassembly support for the first Intel SHA instruction 'sha1rnds4'. Also includes feature flag, and test cases. Support for the remaining instructions will follow in a separate patch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190611 91177308-0d34-0410-b5e6-96231b3b80d8	2013-09-12 15:51:31 +00:00
Benjamin Kramer	d7a178eee3	X86: Add a description of the Intel Atom Silvermont CPU. Currently this is just the atom model with SSE4.2 enabled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189669 91177308-0d34-0410-b5e6-96231b3b80d8	2013-08-30 14:05:32 +00:00
Rafael Espindola	4aa8bdaa46	Rename features to match what gcc and clang use. There is no advantage in being different and using the same names simplifies clang a bit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189141 91177308-0d34-0410-b5e6-96231b3b80d8	2013-08-23 20:21:34 +00:00
Craig Topper	33b5fe7f16	Rename mattr names for AVX-512 from avx-512 -> avx512f, avx-512-pfi -> avx512pf, avx-512-cdi -> avx512cd, avx-512-eri -> avx512er. This matches better with official docs and what gcc patches appear to be using. I didn't touch the has* functions or the feature flag names to avoid changing the td and lowering file while commits are still happening. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188859 91177308-0d34-0410-b5e6-96231b3b80d8	2013-08-21 03:57:57 +00:00
Elena Demikhovsky	c18f4efc5d	Added encoding prefixes for KNL instructions (EVEX). Added 512-bit operands printing. Added instruction formats for KNL instructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187324 91177308-0d34-0410-b5e6-96231b3b80d8	2013-07-28 08:28:38 +00:00
Elena Demikhovsky	e3809eed34	I'm starting to commit KNL backend. I'll push patches one-by-one. This patch includes support for the extended register set XMM16-31, YMM16-31, ZMM0-31. The full ISA you can see here: http://software.intel.com/en-us/intel-isa-extensions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187030 91177308-0d34-0410-b5e6-96231b3b80d8	2013-07-24 11:02:47 +00:00
Benjamin Kramer	b9548d8ee3	X86: Add target description for btver2; make autodetection logic aware of AVX. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181005 91177308-0d34-0410-b5e6-96231b3b80d8	2013-05-03 10:20:08 +00:00
Preston Gurd	d6ac8e9a03	This patch adds the X86FixupLEAs pass, which will reduce instruction latency for certain models of the Intel Atom family, by converting instructions into their equivalent LEA instructions, when it is both useful and possible to do so. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180573 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-25 20:29:37 +00:00
Chad Rosier	88eb89b89f	[asm parser] Add support for predicating MnemonicAlias based on the assembler variant/dialect. Addresses a FIXME in the emitMnemonicAliases function. Use and test case to come shortly. rdar://13688439 and part of PR13340. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179804 91177308-0d34-0410-b5e6-96231b3b80d8	2013-04-18 22:35:36 +00:00
Michael Liao	c26392aa5d	Add support of RDSEED defined in AVX2 extension git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178314 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-28 23:41:26 +00:00
Nadav Rotem	59af9d0bf4	Add the Haswell machine model. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178301 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-28 22:34:46 +00:00
Preston Gurd	1edadea42f	For the current Atom processor, the fastest way to handle a call indirect through a memory address is to load the memory address into a register and then call indirect through the register. This patch implements this improvement by modifying SelectionDAG to force a function address which is a memory reference to be loaded into a virtual register. Patch by Sriram Murali. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178171 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-27 19:14:02 +00:00
Michael Liao	0ca1a7f177	Add HLE target feature git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178082 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-26 22:46:02 +00:00
Jakob Stoklund Olesen	6b359ecd43	Enable SandyBridgeModel for all modern Intel P6 descendants. All Intel CPUs since Yonah look a lot alike, at least at the granularity of the scheduling models. We can add more accurate models for processors that aren't Sandy Bridge if required. Haswell will probably need its own. The Atom processor and anything based on NetBurst is completely different. So are the non-Intel chips. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178080 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-26 22:19:12 +00:00
Michael Liao	675eb3b9ac	Add PREFETCHW codegen support - Add 'PRFCHW' feature defined in AVX2 ISA extension git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178040 91177308-0d34-0410-b5e6-96231b3b80d8	2013-03-26 17:47:11 +00:00
Kay Tiong Khoo	7b672ed380	added basic support for Intel ADX instructions -feature flag, instructions definitions, test cases git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@175196 91177308-0d34-0410-b5e6-96231b3b80d8	2013-02-14 19:08:21 +00:00
Preston Gurd	c7b902e7fe	Pad Short Functions for Intel Atom The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. This patch has been updated to address Nadav's review comments - Optimize only at >= O1 and don't do optimization if -Os is set - Stores MachineBasicBlock* instead of BBNum - Uses DenseMap instead of std::map - Fixes placement of braces Patch by Andy Zhang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171879 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-08 18:27:24 +00:00
Nadav Rotem	5d1f5c1737	Revert revision 171524. Original message: URL: http://llvm.org/viewvc/llvm-project?rev=171524&view=rev Log: The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. Patch by Andy Zhang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171603 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-05 05:42:48 +00:00
Preston Gurd	dd30b47175	The current Intel Atom microarchitecture has a feature whereby when a function returns early then it is slightly faster to execute a sequence of NOP instructions to wait until the return address is ready, as opposed to simply stalling on the ret instruction until the return address is ready. When compiling for X86 Atom only, this patch will run a pass, called "X86PadShortFunction" which will add NOP instructions where less than four cycles elapse between function entry and return. It includes tests. Patch by Andy Zhang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@171524 91177308-0d34-0410-b5e6-96231b3b80d8	2013-01-04 20:54:54 +00:00
Chandler Carruth	5db4bceb47	Make '-mtune=x86_64' assume fast unaligned memory accesses. Not all chips targeted by x86_64 have this feature, but a dramatically increasing number do. Specifying a chip-specific tuning parameter will continue to turn the feature on or off as appropriate for that particular chip, but the generic flag should try to achieve the best performance on the most widely available hardware. Today, the number of chips with fast UA access dwarfs those without in the x86-64 space. Note that this also brings LLVM's code generation for this '-march' flag more in line with that of modern GCCs. Reviewed by Dan Gohman. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@170269 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-15 09:01:13 +00:00
Chandler Carruth	6226146f41	Revert "Make '-mtune=x86_64' assume fast unaligned memory accesses." Accidental commit... git svn betrayed me. Sorry for the noise. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169741 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-10 18:23:52 +00:00
Chandler Carruth	b859d528f3	Make '-mtune=x86_64' assume fast unaligned memory accesses. Summary: Not all chips targeted by x86_64 have this feature, but a dramatically increasing number do. Specifying a chip-specific tuning parameter will continue to turn the feature on or off as appropriate for that particular chip, but the generic flag should try to achieve the best performance on the most widely available hardware. Today, the number of chips with fast UA access dwarfs those without in the x86-64 space. Note that this also brings LLVM's code generation for this '-march' flag more in line with that of modern GCCs. CC: llvm-commits Differential Revision: http://llvm-reviews.chandlerc.com/D195 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@169740 91177308-0d34-0410-b5e6-96231b3b80d8	2012-12-10 18:22:42 +00:00
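
The rcpss/rcpps entry above (e7c966f067) describes the reciprocal estimate as a critical path of "rcp + mul + sub + mul + add". A minimal sketch of that arithmetic follows, assuming one Newton-Raphson step of the form e' = e + e * (1 - x * e). The function names and the 12-bit truncation used to imitate a coarse hardware estimate are hypothetical illustrations, not LLVM code and not the exact behavior of the rcpss instruction.

```python
import struct


def coarse_reciprocal(x: float, keep_bits: int = 12) -> float:
    """Hypothetical stand-in for a hardware estimate such as rcpss.

    Computes 1/x, rounds it to float32, then zeroes all but the top
    `keep_bits` bits of the 23-bit mantissa to imitate a low-precision result.
    """
    bits = struct.unpack("<I", struct.pack("<f", 1.0 / x))[0]
    mask = ~((1 << (23 - keep_bits)) - 1) & 0xFFFFFFFF
    return struct.unpack("<f", struct.pack("<I", bits & mask))[0]


def refine_reciprocal(x: float, estimate: float, steps: int = 1) -> float:
    """Newton-Raphson refinement of an estimate of 1/x.

    Each step performs mul (x * e), sub (1 - ...), mul (e * ...), add (e + ...).
    """
    e = estimate
    for _ in range(steps):
        e = e + e * (1.0 - x * e)
    return e


if __name__ == "__main__":
    x = 3.0
    est = coarse_reciprocal(x)
    print("estimate error:", abs(est - 1.0 / x))
    print("refined error: ", abs(refine_reciprocal(x, est) - 1.0 / x))
```

If the estimate has relative error eps, one step leaves a relative error of about eps squared, which is why a single refinement can be enough after a roughly 12-bit estimate. The sketch runs in double precision, so it shows the shape of the computation rather than the rounding behavior of real single-precision hardware.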


177 Commits