Commit Graph

191 Commits

Author SHA1 Message Date
Sanjay Patel
e4e5cf5a66 make reciprocal estimate code generation more flexible by adding command-line options (3rd try)
The first try (r238051) to land this was reverted due to ExecutionEngine build failure;
that was hopefully addressed by r238788.

The second try (r238842) to land this was reverted due to BUILD_SHARED_LIBS failure;
that was hopefully addressed by r238953.

This patch adds a TargetRecip class for processing many recip codegen possibilities.
The class is intended to handle both command-line options to llc as well
as options passed in from a front-end such as clang with the -mrecip option.

The x86 backend is updated to use the new functionality.
Only -mcpu=btver2 with -ffast-math should see a functional change from this patch.
All other x86 CPUs continue to *not* use reciprocal estimates by default with -ffast-math.

Differential Revision: http://reviews.llvm.org/D8982



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239001 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-04 01:32:35 +00:00
Elena Demikhovsky
49659f6378 X86: Added MPX feature and bound registers.
Intel® Memory Protection Extensions (Intel® MPX) is a new feature in Skylake.
It is a part of KNL and SKX sets. It is also a part of Skylake client.

I added definition of %bnd0 - %bnd3 registers, each register is a pair of 64-bit integers.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238916 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-03 10:30:57 +00:00
Rafael Espindola
a0bcb4184b Revert "make reciprocal estimate code generation more flexible by adding command-line options (2nd try)"
This reverts commit r238842.

It broke -DBUILD_SHARED_LIBS=ON build.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238900 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-03 05:32:44 +00:00
Sanjay Patel
871beb8dd7 make reciprocal estimate code generation more flexible by adding command-line options (2nd try)
The first try (r238051) to land this was reverted due to bot failures
that were hopefully addressed by r238788.

This patch adds a TargetRecip class for processing many recip codegen possibilities.
The class is intended to handle both command-line options to llc as well
as options passed in from a front-end such as clang with the -mrecip option.

The x86 backend is updated to use the new functionality.
Only -mcpu=btver2 with -ffast-math should see a functional change from this patch.
All other x86 CPUs continue to *not* use reciprocal estimates by default with -ffast-math.

Differential Revision: http://reviews.llvm.org/D8982



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238842 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-02 15:28:15 +00:00
Rafael Espindola
58bf2827d3 Revert "make reciprocal estimate code generation more flexible by adding command-line options"
This reverts commit r238051.

It broke some bots:

http://lab.llvm.org:8011/builders/llvm-ppc64-linux1/builds/18190

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238075 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-23 00:22:44 +00:00
Sanjay Patel
7e80a67d35 make reciprocal estimate code generation more flexible by adding command-line options
This patch adds a class for processing many recip codegen possibilities.
The TargetRecip class is intended to handle both command-line options to llc as well
as options passed in from a front-end such as clang with the -mrecip option.

The x86 backend is updated to use the new functionality.
Only -mcpu=btver2 with -ffast-math should see a functional change from this patch.
All other CPUs continue to *not* use reciprocal estimates by default with -ffast-math.

Differential Revision: http://reviews.llvm.org/D8982



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@238051 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-22 21:10:06 +00:00
Eric Christopher
0552d51c45 Migrate existing backends that care about software floating point
to use the information in the module rather than TargetOptions.

We've had and clang has used the use-soft-float attribute for some
time now so have the backends set a subtarget feature based on
a particular function now that subtargets are created based on
functions and function attributes.

For the one middle end soft float check go ahead and create
an overloadable TargetLowering::useSoftFloat function that
just checks the TargetSubtargetInfo in all cases.

Also remove the command line option that hard codes whether or
not soft-float is set by using the attribute for all of the
target specific test cases - for the generic just go ahead and
add the attribute in the one case that showed up.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@237079 91177308-0d34-0410-b5e6-96231b3b80d8
2015-05-12 01:26:05 +00:00
Craig Topper
5ffc995ff3 [X86] Remove FeatureAES for 'corei7' CPU. 'corei7' should match 'nehalem' which doesn't have AES. Having AES and not PCLMUL makes 'corei7' halfway between Nehalem and Westmere.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233517 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-30 06:31:11 +00:00
Craig Topper
b8fa51de42 [X86] Remove two feature flags that covered sets of instructions that have no patterns or intrinsics. Since we don't check feature flags in the assembler parser for any instruction sets, these flags don't provide any value. This frees up 2 of the fully utilized feature flags.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@228282 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-05 08:51:02 +00:00
Sanjay Patel
3cf9267d4e Fix program crashes due to alignment exceptions generated for SSE memop instructions (PR22371).
r224330 introduced a bug by misinterpreting the "FeatureVectorUAMem" bit.
The commit log says that change did not affect anything, but that's not correct.
That change allowed SSE instructions to have unaligned mem operands folded into
math ops, and that's not allowed in the default specification for any SSE variant. 

The bug is exposed when compiling for an AVX-capable CPU that had this feature
flag but without enabling AVX codegen. Another mistake in r224330 was not adding
the feature flag to all AVX CPUs; the AMD chips were excluded.

This is part of the fix for PR22371 ( http://llvm.org/bugs/show_bug.cgi?id=22371 ).

This feature bit is SSE-specific, so I've renamed it to "FeatureSSEUnalignedMem".
Changed the existing test case for the feature bit to reflect the new name and
renamed the test file itself to better reflect the feature.
Added runs to fold-vex.ll to check for the failing codegen.

Note that the feature bit is not set by default on any CPU because it may require a
configuration register setting to enable the enhanced unaligned behavior.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@227983 91177308-0d34-0410-b5e6-96231b3b80d8
2015-02-03 17:13:04 +00:00
Elena Demikhovsky
4519623e9f X86: Added FeatureVectorUAMem for all AVX architectures.
According to AVX specification:

"Most arithmetic and data processing instructions encoded using the VEX prefix and
performing memory accesses have more flexible memory alignment requirements
than instructions that are encoded without the VEX prefix. Specifically,
With the exception of explicitly aligned 16 or 32 byte SIMD load/store instructions,
most VEX-encoded, arithmetic and data processing instructions operate in
a flexible environment regarding memory address alignment, i.e. VEX-encoded
instruction with 32-byte or 16-byte load semantics will support unaligned load
operation by default. Memory arguments for most instructions with VEX prefix
operate normally without causing #GP(0) on any byte-granularity alignment
(unlike Legacy SSE instructions)."

The same for AVX-512.

This change does not affect anything right now, because only the "memop pattern fragment"
depends on FeatureVectorUAMem and it is not used in AVX patterns.
All AVX patterns are based on the "unaligned load" anyway.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@224330 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-16 09:10:08 +00:00
Chandler Carruth
de0cdb0890 [x86] Fix the test to actually test things for the CPU names, add the
missing barcelona CPU which that test uncovered, and remove the 32-bit
x86 CPUs which I really wasn't prepared to audit and test thoroughly.

If anyone wants to clean up the 32-bit only x86 CPUs, go for it.

Also, if anyone else wants to try to de-duplicate the AMD CPUs, that'd
be cool, but from the looks of it wouldn't save as much as it did for
the Intel CPUs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223774 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-09 14:25:55 +00:00
Chandler Carruth
8729843d31 [x86] Bring some sanity to the x86 CPU processor definitions.
Notably, this adds simple micro-architecture names for the Intel CPU
variants, and defines the old 'core'-based names as aliases. GCC has
started to simplify their documented interface to use these names as
well, so it seems like we can start to converge on a consistent pattern.

I'd appreciate Intel double checking the entries that aren't yet
documented widely, especially Atom (Bonnell and Silvermont), Knights
Landing, and Skylake. But this change shouldn't break any existing
users.

Also, ran clang-format to re-format this code and it actually worked
(modulo a tiny bug) so hopefully we can start to stop thinking about
formatting this stuff.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223769 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-09 10:58:36 +00:00
Michael Liao
d3c452a506 [X86] Clean up whitespace as well as minor coding style
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223339 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 05:20:33 +00:00
Sanjay Patel
c5992119fc Enable FeatureFastUAMem for btver2
Allow unaligned 16-byte memop codegen for btver2. No functional changes for any other subtargets.

Replace the existing supposed small memcpy test with an actual test of a small memcpy. 
The previous test wasn't using FileCheck either.

This patch should allow us to close PR21541 ( http://llvm.org/bugs/show_bug.cgi?id=21541 ).

Differential Revision: http://reviews.llvm.org/D6360



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222925 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-28 18:40:18 +00:00
Sanjay Patel
28660d4b2f Add a feature flag for slow 32-byte unaligned memory accesses [x86].
This patch adds a feature flag to avoid unaligned 32-byte load/store AVX codegen
for Sandy Bridge and Ivy Bridge. There is no functionality change intended for 
those chips. Previously, the absence of AVX2 was being used as a proxy to detect
this feature. But that hindered codegen for AVX-enabled AMD chips such as btver2
that do not have the 32-byte unaligned access slowdown.

Performance measurements are included in PR21541 ( http://llvm.org/bugs/show_bug.cgi?id=21541 ).

Differential Revision: http://reviews.llvm.org/D6355



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222544 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-21 17:40:04 +00:00
Alexey Volkov
d0d0424368 [X86] For Silvermont CPU use 16-bit division instead of 64-bit for small positive numbers
Differential Revision: http://reviews.llvm.org/D5938



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222521 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-21 11:19:34 +00:00
Alexey Volkov
19e8fe05dc [X86] Use ADD/SUB instead of INC/DEC for Haswell and Broadwell CPUs
Differential Revision: http://reviews.llvm.org/D5934



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@222141 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-17 16:17:51 +00:00
Sanjay Patel
e7c966f067 Use rcpss/rcpps (X86) to speed up reciprocal calcs (PR21385).
This is a first step for generating SSE rcp instructions for reciprocal
calcs when fast-math allows it. This is very similar to the rsqrt optimization
enabled in D5658 ( http://reviews.llvm.org/rL220570 ).

For now, be conservative and only enable this for AMD btver2 where performance
improves significantly both in terms of latency and throughput.

We may never enable this codegen for Intel Core* chips because the divider circuits
are just too fast. On SandyBridge, divss can be as fast as 10 cycles versus the 21
cycle critical path for the rcp + mul + sub + mul + add estimate.

Follow-on patches may allow configuration of the number of Newton-Raphson refinement
steps, add AVX512 support, and enable the optimization for more chips.

More background here: http://llvm.org/bugs/show_bug.cgi?id=21385

Differential Revision: http://reviews.llvm.org/D6175



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221706 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-11 20:51:00 +00:00
Andrea Di Biagio
c270e8abbe [X86] Add 'FeatureSlowSHLD' to cpu 'bdver3'. Also explicit set FeatureAVX and FeatureSSE4A for all the bdver* cpus.
This patch adds 'FeatureSlowSHLD' to 'bdver3'.
According to the official AMD optimization guide for amdfam15: "Using
alternative code in place of SHLD achieves lower overall latency and
requires fewer execution resources. The 32-bit and 64-bit forms of
ADD, ADC, SHR, and LEA (except 16-bit form) are DirectPath
instructions, while SHLD is a VectorPath instruction."

This patch also explicitly sets feature AVX and SSE4A for all the bdver*
cpus. This part of the patch is a non-functional change and it is mainly
done for clarity reasons (Both XOP and FMA4 already imply AVX and SSE4A).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221296 91177308-0d34-0410-b5e6-96231b3b80d8
2014-11-04 21:18:09 +00:00
Sanjay Patel
a46f06efe2 Use rsqrt (X86) to speed up reciprocal square root calcs
This is a first step for generating SSE rsqrt instructions for
reciprocal square root calcs when fast-math is allowed.

For now, be conservative and only enable this for AMD btver2
where performance improves significantly - for example, 29%
on llvm/projects/test-suite/SingleSource/Benchmarks/BenchmarkGame/n-body.c
(if we convert the data type to single-precision float).

This patch adds a two constant version of the Newton-Raphson
refinement algorithm to DAGCombiner that can be selected by any target
via a parameter returned by getRsqrtEstimate()..

See PR20900 for more details:
http://llvm.org/bugs/show_bug.cgi?id=20900

Differential Revision: http://reviews.llvm.org/D5658



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220570 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-24 17:02:16 +00:00
Sanjay Patel
a9d7398280 Add a scheduling model for AMD 16H Jaguar (btver2).
This is a first pass at a scheduling model for Jaguar.
It's structured largely on the existing SandyBridge and SLM sched models.

Using this model, in addition to turning on the PostRA scheduler, results in 
some perf wins on internal and 3rd party benchmarks. There's not much difference 
in LLVM's test-suite benchmarking subset of tests.

Differential Revision: http://reviews.llvm.org/D5229



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217457 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-09 20:07:07 +00:00
Robert Khasanov
3a34f5e115 [x86] Enable Broadwell target.
Added FeatureSMAP.

Broadwell ISA includes Haswell ISA + ADX + RDSEED + SMAP


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216161 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-21 09:16:12 +00:00
Kevin Enderby
42deb12738 Add support for the X86 secure guard extensions instructions in assembler (SGX).
This allows assembling the two new instructions, encls and enclu for the
SKX processor model.

Note the diffs are a bigger than what might think, but to fit the new
MRM_CF and MRM_D7 in things in the right places things had to be
renumbered and shuffled down causing a bit more diffs.

rdar://16228228


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214460 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-31 23:57:38 +00:00
Robert Khasanov
aac33cfc08 [SKX] Enabling SKX target and AVX512BW, AVX512DQ, AVX512VL features.
Enabling HasAVX512{DQ,BW,VL} predicates.
Adding VK2, VK4, VK32, VK64 masked register classes.
Adding new types (v64i8, v32i16) to VR512.
Extending calling conventions for new types (v64i8, v32i16)

Patch by Zinovy Nis <zinovy.y.nis@intel.com>
Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213545 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-21 14:54:21 +00:00
Elena Demikhovsky
0780b6db5d AVX-512: dec/inc instructions are slow on KNL
After Alexey Volkov, I'm adding the same property for KNL, that prefers ADD/SUB instead of INC/DEC.
Added a test.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@212178 91177308-0d34-0410-b5e6-96231b3b80d8
2014-07-02 14:11:05 +00:00
Alexey Volkov
a2bc6951a0 [X86] Use ADD/SUB instead of INC/DEC for Silvermont
According to Intel Software Optimization Manual 
on Silvermont INC or DEC instructions require 
an additional uop to merge the flags.
As a result, a branch instruction depending 
on an INC or a DEC instruction incurs a 1 cycle penalty.

Differential Revision: http://reviews.llvm.org/D3990



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210466 91177308-0d34-0410-b5e6-96231b3b80d8
2014-06-09 11:40:41 +00:00
Alexey Volkov
0d0bab5168 [X86] Tune LEA usage for Silvermont
According to Intel Software Optimization Manual on Silvermont in some cases LEA
is better to be replaced with ADD instructions:
"The rule of thumb for ADDs and LEAs is that it is justified to use LEA
with a valid index and/or displacement for non-destructive destination purposes
(especially useful for stack offset cases), or to use a SCALE.
Otherwise, ADD(s) are preferable."

Differential Revision: http://reviews.llvm.org/D3826


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@209198 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-20 08:55:50 +00:00
Chandler Carruth
905e33545c [x86] Make the 'x86-64' cpu, what I see as and many use as the generic
default architecture for reasonable modern x86 processors, actually be
modern. This processor model should essentially be "tuned" for modern
x86 chips as much as possible without undue penalties on any specific
architecture. Previously we weren't even using the nice scheduling
models. There are a few other tweaks needed here, but this change at
least I have benchmarked across a decent swatch of chips (intel's
clovertown, westmere, and sandybridge; amd's istanbul) and seen no
significant regressions.

If anyone has suggested ways to test this, just let me know. Somewhat
alarmingly, no existing tests failed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208230 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-07 17:37:03 +00:00
Benjamin Kramer
3cddd1607c Add a description for AMD's bdver4 (aka Excavator).
This is just bdver3 + AVX2 + BMI2.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207847 91177308-0d34-0410-b5e6-96231b3b80d8
2014-05-02 15:47:07 +00:00
Alexey Volkov
177c1ef30d Enable FeatureFastUAMem for Silvermont processor
Differential Revision: http://llvm-reviews.chandlerc.com/D2982


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203218 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-07 09:03:49 +00:00
Alexey Volkov
adaa3e5760 Test commit
Removed whitespace


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@203216 91177308-0d34-0410-b5e6-96231b3b80d8
2014-03-07 08:28:44 +00:00
Craig Topper
93c9401dff [x86] Add basic support for .code16
This is not really expected to work right yet. Mostly because we will
still emit the OpSize (0x66) prefix in all the wrong places, along with
a number of other corner cases. Those will all be fixed in the subsequent
commits.

Patch from David Woodhouse.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@198584 91177308-0d34-0410-b5e6-96231b3b80d8
2014-01-06 04:55:54 +00:00
Rafael Espindola
4a6855441c Change the default of AsmWriterClassName and isMCAsmWriter.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196065 91177308-0d34-0410-b5e6-96231b3b80d8
2013-12-02 04:55:42 +00:00
Ekaterina Romanova
46f7257ed1 SHLD/SHRD are VectorPath (microcode) instructions known to have poor latency on certain architectures. While generating SHLD/SHRD instructions is acceptable when optimizing for size, optimizing for speed on these platforms should be implemented using alternative sequences of instructions composed of add, adc, shr, shl, or and lea which are directPath instructions. These alternative instructions not only have a lower latency but they also increase the decode bandwidth by allowing simultaneous decoding of a third directPath instruction.
AMD's processors family K7, K8, K10, K12, K15 and K16 are known to have SHLD/SHRD instructions with very poor latency. Optimization guides for these processors recommend using an alternative sequence of instructions. For these AMD's processors, I disabled folding (or (x << c) | (y >> (64 - c))) when we are not optimizing for size.

It might be beneficial to disable this folding for some of the Intel's processors. However, since I couldn't find specific recommendations regarding using SHLD/SHRD instructions on Intel's processors, I haven't disabled this peephole for Intel.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@195383 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-21 23:21:26 +00:00
Benjamin Kramer
00e3be6134 X86: Add a description for AMD bdver3 aka Steamroller.
This is just bdver2 + FSGSBase.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@193984 91177308-0d34-0410-b5e6-96231b3b80d8
2013-11-04 10:29:20 +00:00
Yunzhong Gao
cdb9bd7eb9 Enabling 3DNow! prefetch instruction for a few AMD processors: bobcat, jaguar,
bulldozer and piledriver. Support for the instruction itself seems to have
already been added in r178040.

Differential Revision: http://llvm-reviews.chandlerc.com/D1933



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192828 91177308-0d34-0410-b5e6-96231b3b80d8
2013-10-16 19:04:11 +00:00
Nick Lewycky
e66dd40d74 Rename this feature to "cx16" to match gcc's flag name. Apparently these strings
are directly tied to the flag names in clang with no remapping in between?


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@192044 91177308-0d34-0410-b5e6-96231b3b80d8
2013-10-05 20:11:44 +00:00
Yunzhong Gao
4da61345ec Adding a feature flag to the llvm backend for x86 TBM instruction set.
Adding TBM feature to bdver2 processor; piledriver supports this instruction set
according to the following document:
http://developer.amd.com/wordpress/media/2012/10/New-Bulldozer-and-Piledriver-Instructions.pdf

Phabricator code review is located here: http://llvm-reviews.chandlerc.com/D1692



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@191324 91177308-0d34-0410-b5e6-96231b3b80d8
2013-09-24 18:21:52 +00:00
Preston Gurd
2ff37c701e Remove unused code, which had been commented out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190869 91177308-0d34-0410-b5e6-96231b3b80d8
2013-09-17 16:53:36 +00:00
Craig Topper
5fefc00bac Make F16C feature flag imply AVX rather than just checking both at the patterns.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190775 91177308-0d34-0410-b5e6-96231b3b80d8
2013-09-16 04:29:58 +00:00
Preston Gurd
94dc6540a8 Adds support for Atom Silvermont (SLM) - -march=slm
Implements Instruction scheduler latencies for Silvermont,
using latencies from the Intel Silvermont Optimization Guide.

Auto detects SLM.

Turns on post RA scheduler when generating code for SLM.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190717 91177308-0d34-0410-b5e6-96231b3b80d8
2013-09-13 19:23:28 +00:00
Ben Langmuir
1f1bd9a54d Partial support for Intel SHA Extensions (sha1rnds4)
Add basic assembly/disassembly support for the first Intel SHA
instruction 'sha1rnds4'. Also includes feature flag, and test cases.

Support for the remaining instructions will follow in a separate patch.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@190611 91177308-0d34-0410-b5e6-96231b3b80d8
2013-09-12 15:51:31 +00:00
Benjamin Kramer
d7a178eee3 X86: Add a description of the Intel Atom Silvermont CPU.
Currently this is just the atom model with SSE4.2 enabled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189669 91177308-0d34-0410-b5e6-96231b3b80d8
2013-08-30 14:05:32 +00:00
Rafael Espindola
4aa8bdaa46 Rename features to match what gcc and clang use.
There is no advantage in being different and using the same names simplifies
clang a bit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@189141 91177308-0d34-0410-b5e6-96231b3b80d8
2013-08-23 20:21:34 +00:00
Craig Topper
33b5fe7f16 Rename mattr names for AVX-512 to from avx-512 -> avx512f, avx-512-pfi -> av512pf, avx-512-cdi -> avx512cd, avx-512-eri->avx512er. This matches better with official docs and what gcc patches appearto be using. I didn't touch the has* functions or the feature flag names to avoid change the td and lowering file while commits are still happening.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@188859 91177308-0d34-0410-b5e6-96231b3b80d8
2013-08-21 03:57:57 +00:00
Elena Demikhovsky
c18f4efc5d Added encoding prefixes for KNL instructions (EVEX).
Added 512-bit operands printing.
Added instruction formats for KNL instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187324 91177308-0d34-0410-b5e6-96231b3b80d8
2013-07-28 08:28:38 +00:00
Elena Demikhovsky
e3809eed34 I'm starting to commit KNL backend. I'll push patches one-by-one. This patch includes support for the extended register set XMM16-31, YMM16-31, ZMM0-31.
The full ISA you can see here: http://software.intel.com/en-us/intel-isa-extensions


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@187030 91177308-0d34-0410-b5e6-96231b3b80d8
2013-07-24 11:02:47 +00:00
Benjamin Kramer
b9548d8ee3 X86: Add target description for btver2; make autodetection logic aware of AVX.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@181005 91177308-0d34-0410-b5e6-96231b3b80d8
2013-05-03 10:20:08 +00:00
Preston Gurd
d6ac8e9a03 This patch adds the X86FixupLEAs pass, which will reduce instruction
latency for certain models of the Intel Atom family, by converting
instructions into their equivalent LEA instructions, when it is both
useful and possible to do so.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@180573 91177308-0d34-0410-b5e6-96231b3b80d8
2013-04-25 20:29:37 +00:00