Commit Graph

30888 Commits

Author SHA1 Message Date
Chandler Carruth
f159de96bd [x86] Add a really preposterous number of patterns for matching all of
the various ways in which blends can be used to do vector element
insertion for lowering with the scalar math instruction forms that
effectively re-blend with the high elements after performing the
operation.

This then allows me to bail on the element insertion lowering path when
we have SSE4.1 and are going to be doing a normal blend, which in turn
restores the last of the blends lost from the new vector shuffle
lowering when I got it to prioritize insertion in other cases (for
example when we don't *have* a blend instruction).

Without the patterns, using blends here would have regressed
sse-scalar-fp-arith.ll *completely* with the new vector shuffle
lowering. For completeness, I've added RUN-lines with the new lowering
here. This is somewhat superfluous as I'm about to flip the default, but
hey, it shows that this actually significantly changed behavior.

The patterns I've added are just ridiculously repetative. Suggestions on
making them better very much welcome. In particular, handling the
commuted form of the v2f64 patterns is somewhat obnoxious.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219033 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 22:43:17 +00:00
Chandler Carruth
91ea3e41ae [x86] Adjust the patterns for lowering X86vzmovl nodes which don't
perform a load to use blendps rather than movss when it is available.

For non-loads, blendps is *much* faster. It can execute on two ports in
Sandy Bridge and Ivy Bridge, and *three* ports on Haswell. This fixes
one of the "regressions" from aggressively taking the "insertion" path
in the new vector shuffle lowering.

This does highlight one problem with blendps -- it isn't commuted as
heavily as it should be. That's future work though.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219022 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 21:38:49 +00:00
Richard Smith
2451e5b581 PR21145: Teach LLVM about C++14 sized deallocation functions.
C++14 adds new builtin signatures for 'operator delete'. This change allows
new/delete pairs to be removed in C++14 onwards, as they were in C++11 and
before.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219014 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 20:17:06 +00:00
Adam Nemet
726942c8bb [ISel] Keep matching state consistent when folding during X86 address match
In the X86 backend, matching an address is initiated by the 'addr' complex
pattern and its friends.  During this process we may reassociate and-of-shift
into shift-of-and (FoldMaskedShiftToScaledMask) to allow folding of the
shift into the scale of the address.

However as demonstrated by the testcase, this can trigger CSE of not only the
shift and the AND which the code is prepared for but also the underlying load
node.  In the testcase this node is sitting in the RecordedNode and MatchScope
data structures of the matcher and becomes a deleted node upon CSE.  Returning
from the complex pattern function, we try to access it again hitting an assert
because the node is no longer a load even though this was checked before.

Now obviously changing the DAG this late is bending the rules but I think it
makes sense somewhat.  Outside of addresses we prefer and-of-shift because it
may lead to smaller immediates (FoldMaskAndShiftToScale is an even better
example because it create a non-canonical node).  We currently don't recognize
addresses during DAGCombiner where arguably this canonicalization should be
performed.  On the other hand, having this in the matcher allows us to cover
all the cases where an address can be used in an instruction.

I've also talked a little bit to Dan Gohman on llvm-dev who added the RAUW for
the new shift node in FoldMaskedShiftToScaledMask.  This RAUW is responsible
for initiating the recursive CSE on users
(http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-September/076903.html) but it
is not strictly necessary since the shift is hooked into the visited user.  Of
course it's safer to keep the DAG consistent at all times (e.g. for accurate
number of uses, etc.).

So rather than changing the fundamentals, I've decided to continue along the
previous patches and detect the CSE.  This patch installs a very targeted
DAGUpdateListener for the duration of a complex-pattern match and updates the
matching state accordingly.  (Previous patches used HandleSDNode to detect the
CSE but that's not practical here).  The listener is only installed on X86.

I tested that there is no measurable overhead due to this while running
through the spec2k BC files with llc.  The only thing we pay for is the
creation of the listener.  The callback never ever triggers in spec2k since
this is a corner case.

Fixes rdar://problem/18206171

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219009 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 20:00:34 +00:00
Tom Stellard
77859c9e9c R600: Align functions to 256 bytes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219002 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 19:02:02 +00:00
Benjamin Kramer
63688e622c Eliminate some deep std::vector copies. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218999 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 18:33:16 +00:00
Robin Morisset
8e2b2ae80e [Power] Use lwsync for non-seq_cst fences
Summary:
hwsync is only required for seq_cst fences, acquire and release one can use
the cheaper lwsync.

Test Plan: Added some cases to atomics.ll + make check-all

Reviewers: jfb, wschmidt

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5317

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218995 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 18:04:36 +00:00
Hans Wennborg
d8af735ca7 MipsAsmParser.cpp: fix VS2012 build
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218991 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 17:16:24 +00:00
Hans Wennborg
8651aece06 HexagonMCCodeEmitter.h: deleted member functions are not supported in VS2012
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218990 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 17:02:28 +00:00
Daniel Sanders
61bc405795 [mips] Print warning when using register names not available in N32/64
Summary:
The register names t4-t7 are not available in the N32 and N64 ABIs.
This patch prints a warning, when those names are used in N32/64,
along with a fix-it with the correct register names.

Patch by Vasileios Kalintiris

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5272


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218989 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 15:37:37 +00:00
Sid Manning
c4d113192d Fix build break on Hexagon
Differential Revision: http://reviews.llvm.org/D5600

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218987 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 13:59:01 +00:00
Sid Manning
30d67d99cf Adding skeleton for unit testing Hexagon Code Emission
Adding and modifying CMakeLists.txt files to run unit tests under
unittests/Target/* if the directory exists.  Adding basic unit test to check
that code emitter object can be retrieved.

Differential Revision: http://reviews.llvm.org/D5523
Change by: Colin LeMahieu

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218986 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 13:18:11 +00:00
Chandler Carruth
dce98e6739 [x86] Teach the new vector shuffle lowering to aggressively form MOVSS
and MOVSD nodes for single element vector inserts.

This is particularly important because a number of patterns in the
backend detect these patterns and leverage them to simplify things. It
also fixes quite a few of the insertion bad code examples. However, it
regresses a specific area: when available, blendps and blendpd are
*dramatically* faster than movss and movsd respectively. But it doesn't
really work to form the blend logic first because the blends *aren't* as
crazy efficient when the data is coming from memory anyways, and thus
will have a movss or movsd regardless. Also, doing that would block
a bunch of the patterns that this is designed to hit.

So my plan is to go into the patterns for lowering MOVSS and MOVSD and
lower them via blends when available. However that's a pretty invasive
restructuring so it will need to be a follow-up patch.

I have already gone into the patterns to lower MOVSS and MOVSD from
memory using MOVLPD, etc. Without that, several of the test cases
I already have regress.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218985 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 13:11:13 +00:00
Renato Golin
b157cb7afd Revert 202433 - Provide a target override for the latest regalloc heuristic
That commit was introduced in order to help investigate a problem in ARM
codegen breaking from commit 202304 (Add a limit to the heuristic that register
allocates instructions in local order). Recent analisys indicated that the
problem no longer exists, so I'm reverting this change.

See PR18996.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218981 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 12:20:53 +00:00
Chandler Carruth
7ae6f2abf6 [x86] Refactor the element insertion logic in the new vector shuffle
lowering to handle the potential mirroring of 2-element vectors (because
we can't reliably sort them one way) in the caller rather than in the
insertion logic.

This will simplify things considerably as more ways to fail to match the
insertion are added because now we have a nice try and retry point.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218980 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 12:01:55 +00:00
Chandler Carruth
01b3858e66 [x86] Significantly improve the ability of the new vector shuffle
lowering to match VZEXT_MOVL patterns.

I hadn't realized that these had sufficient pattern smarts in the
backend to lower zext-ing from the low element of a vector without it
being a scalar_to_vector node. They do, and this is how to match a bunch
of patterns for movq, movss, etc.

There is a weird propensity to end up using pshufd to place the element
afterward even though it means domain crossing (or rather, to use
xorps+movss to zext the element rather than movq) but that's an
orthogonal problem with VZEXT_MOVL that someone should probably look at.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218977 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 11:25:58 +00:00
Chandler Carruth
53bf81ae59 [x86] Unbreak SSE1 with the new vector shuffle lowering. We can't widen
element types to form illegal vector types.

I've added a special SSE1 test case here that makes sure we don't break
this going forward.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218974 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 10:11:39 +00:00
Eric Christopher
8f09464bc9 constify TargetMachine parameter.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218934 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 00:42:41 +00:00
Eric Christopher
59cacc9dec constify TargetMachine argument.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218930 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 00:17:59 +00:00
Eric Christopher
1340986490 We can grab the options struct from the TargetMachine, no need to
pass it down in the constructor.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218929 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-03 00:10:03 +00:00
Adam Nemet
6955c9d1ac [AVX512] Pull pattern for subvector insert into the instruction definition
No functional change intended.

Very similar to the change I made for subvector extract in r218480.

test/CodeGen/X86/avx512-insert-extract.ll covers this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218928 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-02 23:18:30 +00:00
Adam Nemet
d9e2cc7fa0 [AVX512] Refactor subvector inserts
No functional change.

Very similar to the extract refactoring I did in r218478.

Compared X86.td.expanded before and after.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218927 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-02 23:18:28 +00:00
Adam Nemet
a9014e5530 [AVX512] Fix i256mem->f256mem typo in VINSERTF64x4rm
Just like in the case of extracts, the refactoring is uncovering some typos in
the code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218926 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-02 23:18:26 +00:00
Hal Finkel
626236d9bc [PowerPC] Modern Book-E cores support sync
Older Book-E cores, such as the PPC 440, support only msync (which has the same
encoding as sync 0), but not any of the other sync forms. Newer Book-E cores,
however, do support sync, and for performance reasons we should allow the use
of the more-general form.

This refactors msync use into its own feature group so that it applies by
default only to older Book-E cores (of the relevant cores, we only have
definitions for the PPC440/450 currently).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218923 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-02 22:34:22 +00:00
Robin Morisset
2b1874cbd4 [Power] Improve the expansion of atomic loads/stores
Summary:
Atomic loads and store of up to the native size (32 bits, or 64 for PPC64)
can be lowered to a simple load or store instruction (as the synchronization
is already handled by AtomicExpand, and the atomicity is guaranteed thanks to
the alignment requirements of atomic accesses). This is exactly what this patch
does. Previously, these were implemented by complex
load-linked/store-conditional loops.. an obvious performance problem.

For example, this patch turns
```
define void @store_i8_unordered(i8* %mem) {
  store atomic i8 42, i8* %mem unordered, align 1
  ret void
}
```
from
```
_store_i8_unordered:                    ; @store_i8_unordered
; BB#0:
    rlwinm r2, r3, 3, 27, 28
    li r4, 42
    xori r5, r2, 24
    rlwinm r2, r3, 0, 0, 29
    li r3, 255
    slw r4, r4, r5
    slw r3, r3, r5
    and r4, r4, r3
LBB4_1:                                 ; =>This Inner Loop Header: Depth=1
    lwarx r5, 0, r2
    andc r5, r5, r3
    or r5, r4, r5
    stwcx. r5, 0, r2
    bne cr0, LBB4_1
; BB#2:
    blr
```
into
```
_store_i8_unordered:                    ; @store_i8_unordered
; BB#0:
    li r2, 42
    stb r2, 0(r3)
    blr

```
which looks like a pretty clear win to me.

Test Plan:
fixed the tests + new test for indexed accesses + make check-all

Reviewers: jfb, wschmidt, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5587

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218922 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-02 22:27:07 +00:00
Juergen Ributzka
b3f91b0af7 [Stackmaps] Make ithe frame-pointer required for stackmaps.
Do not eliminate the frame pointer if there is a stackmap or patchpoint in the
function. All stackmap references should be FP relative.

This fixes PR21107.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218920 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-02 22:21:49 +00:00
Chandler Carruth
bf21d40070 [x86] Teach the new vector shuffle lowering to widen floating point
elements as well as integer elements in order to form simpler shuffle
patterns.

This is the primary reason why we were failing to match some of the
2-and-2 floating point shuffles such as PR21140. Even after fixing this
we need to support some extra patterns in the backend in order to match
the resulting X86ISD::UNPCKL nodes into the correct instructions. This
commit should fix PR21140 and includes more comprehensive testing of
insertion patterns in v4 shuffles.

Not all of the added tests are beautiful. For example, we don't have
clever instructions to insert-via-load in the integer domain. There are
also some places where we aren't sufficiently cunning with our use of
movq and movd, but that's future work.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218911 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-02 21:37:14 +00:00
Tilmann Scheller
ad7783df73 [NVPTX] Remove dead code.
Found by the Clang static analyzer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218874 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-02 15:12:48 +00:00
Joerg Sonnenberger
92583e0712 Support padding unaligned data in .text.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218870 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-02 13:41:42 +00:00
Chandler Carruth
4bbf21e71e [x86] Improve and correct how the new vector shuffle lowering was
matching and lowering 64-bit insertions.

The first problem was that we weren't looking through bitcasts to
discover that we *could* lower as insertions. Once fixed, we in turn
weren't looking through bitcasts to discover that we could fold a load
into the lowering. Once fixed, we weren't forming a SCALAR_TO_VECTOR
node around the inserted element and instead were passing a scalar to
a DAG node that expected a vector. It turns out there are some patterns
that will "lower" this into the correct asm, but the rest of the X86
backend is very unhappy with such antics.

This should fix a few more edge case regressions I've spotted going
through the regression test suite to enable the new vector shuffle
lowering.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218839 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 23:14:28 +00:00
Eric Christopher
300743f74a constify the TargetMachine argument used in the subtarget and
lowering constructors.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218832 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 21:36:28 +00:00
Sanjay Patel
2b918388ab Lower FNEG ( FABS (x) ) -> FNABS (x) [X86 codegen] PR20578
Negative FABS of either a scalar or vector should be handled the same way
on x86 with SSE/AVX: a single OR instruction of the FP operand with a
constant to light up the sign bit(s).

http://llvm.org/bugs/show_bug.cgi?id=20578

Differential Revision: http://reviews.llvm.org/D5201



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218822 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 21:20:06 +00:00
Eric Christopher
406dccea99 Now that the optimization level is adjusting the feature string
before we hit the subtarget, remove the constructor parameter.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218817 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 21:05:35 +00:00
Eric Christopher
c9038d9c1b Rework the PPC TargetMachine so that the non-function specific
overrides happen at TargetMachine creation and not on every
subtarget creation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218805 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 20:38:26 +00:00
Eric Christopher
2e07dedce3 constify TargetMachine parameter for X86TargetLowering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218804 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 20:38:22 +00:00
Sanjay Patel
72447214a6 Don't repeat function/variable name in comment. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218791 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 19:39:32 +00:00
Tim Northover
472f2a056d ARM: allow copying of CPSR when all else fails.
As with x86 and AArch64, certain situations can arise where we need to spill
CPSR in the middle of a calculation. These should be avoided where possible
(MRS/MSR is rather expensive), which ARM is actually better at than the other
two since it tries to Glue defs to uses, but as a last ditch effort, copying is
better than crashing.

rdar://problem/18011155

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218789 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 19:21:03 +00:00
Adrian Prantl
02474a32eb Move the complex address expression out of DIVariable and into an extra
argument of the llvm.dbg.declare/llvm.dbg.value intrinsics.

Previously, DIVariable was a variable-length field that has an optional
reference to a Metadata array consisting of a variable number of
complex address expressions. In the case of OpPiece expressions this is
wasting a lot of storage in IR, because when an aggregate type is, e.g.,
SROA'd into all of its n individual members, the IR will contain n copies
of the DIVariable, all alike, only differing in the complex address
reference at the end.

By making the complex address into an extra argument of the
dbg.value/dbg.declare intrinsics, all of the pieces can reference the
same variable and the complex address expressions can be uniqued across
the CU, too.
Down the road, this will allow us to move other flags, such as
"indirection" out of the DIVariable, too.

The new intrinsics look like this:
declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr)
declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr)

This patch adds a new LLVM-local tag to DIExpressions, so we can detect
and pretty-print DIExpression metadata nodes.

What this patch doesn't do:

This patch does not touch the "Indirect" field in DIVariable; but moving
that into the expression would be a natural next step.

http://reviews.llvm.org/D4919
rdar://problem/17994491

Thanks to dblaikie and dexonsmith for reviewing this patch!

Note: I accidentally committed a bogus older version of this patch previously.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218787 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 18:55:02 +00:00
Reed Kotler
befab0a552 Add fptrunc to mips fast-sel
Summary: Implement conversion of 64 to 32 bit floating point numbers (fptrunc) in mips fast-isel

Test Plan:
fptrunc.ll
checked also with 4 internal mips build bot flavors mip32r1/miprs32r2 and at -O0 and -O2

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: rfuhler

Differential Revision: http://reviews.llvm.org/D5553

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218785 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 18:47:02 +00:00
Adrian Prantl
10c4265675 Revert r218778 while investigating buldbot breakage.
"Move the complex address expression out of DIVariable and into an extra"

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218782 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 18:10:54 +00:00
Adrian Prantl
076fd5dfc1 Move the complex address expression out of DIVariable and into an extra
argument of the llvm.dbg.declare/llvm.dbg.value intrinsics.

Previously, DIVariable was a variable-length field that has an optional
reference to a Metadata array consisting of a variable number of
complex address expressions. In the case of OpPiece expressions this is
wasting a lot of storage in IR, because when an aggregate type is, e.g.,
SROA'd into all of its n individual members, the IR will contain n copies
of the DIVariable, all alike, only differing in the complex address
reference at the end.

By making the complex address into an extra argument of the
dbg.value/dbg.declare intrinsics, all of the pieces can reference the
same variable and the complex address expressions can be uniqued across
the CU, too.
Down the road, this will allow us to move other flags, such as
"indirection" out of the DIVariable, too.

The new intrinsics look like this:
declare void @llvm.dbg.declare(metadata %storage, metadata %var, metadata %expr)
declare void @llvm.dbg.value(metadata %storage, i64 %offset, metadata %var, metadata %expr)

This patch adds a new LLVM-local tag to DIExpressions, so we can detect
and pretty-print DIExpression metadata nodes.

What this patch doesn't do:

This patch does not touch the "Indirect" field in DIVariable; but moving
that into the expression would be a natural next step.

http://reviews.llvm.org/D4919
rdar://problem/17994491

Thanks to dblaikie and dexonsmith for reviewing this patch!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218778 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 17:55:39 +00:00
Tom Stellard
56077f5796 R600: Call EmitFunctionHeader() in the AsmPrinter to populate the ELF symbol table
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218776 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 17:15:17 +00:00
Toma Tabacu
a2878ec715 [mips] Rename emit and parse functions for the .cpload assembler directive. NFC.
Summary: It's better if we have a consistent name for .cpload-related functions.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5437

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218768 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 14:53:19 +00:00
Tom Stellard
f7082f9bd7 R600/SI: Add a generic pseudo EXP instruction
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218767 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 14:44:45 +00:00
Tom Stellard
cbb63311cd R600/SI: Add generic pseudo MTBUF instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218766 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 14:44:43 +00:00
Tom Stellard
f69ae4815a R600/SI: Add generic pseudo SMRD instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218765 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 14:44:42 +00:00
Oliver Stannard
9d7038c437 [ARM] Allow selecting VRINT[APMXZR] and VCVT[BT] instructions for FPv5
Currently, we only codegen the VRINT[APMXZR] and VCVT[BT] instructions
when targeting ARMv8, but they are actually present on any target with
FP-ARMv8. Note that FP-ARMv8 is called FPv5 when is is part of an
M-profile core, but they have the same instructions so we model them
both as FPARMv8 in the ARM backend.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218763 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 13:13:18 +00:00
Chandler Carruth
7d64681274 [x86] Fix a few more tiny patterns with the new vector shuffle lowering
that keep cropping up in the regression test suite.

This also addresses one of the issues raised on the mailing list with
failing to form 'movsd' in as many cases as we realistically should.
There will be corresponding patches forthcoming for v4f32 at least. This
was a lot of fuss for a relatively small gain, but all the fuss was on
my end trying different ways of holding the pieces of the x86 fragment
patterns *just right*. Now that it works, the code is reasonably simple.

In the new test cases I'm adding here, v2i64 sticks out as just plain
horrible. I've not come up with any great ideas here other than that it
would be nice to recognize when we're *going* to take a domain crossing
hit and cross earlier to get the decent instructions. At least with AVX
it is slightly less silly....

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218756 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 11:14:02 +00:00
Chandler Carruth
a1b88ab2c1 [x86] Delete some extraneous logic from the new vector shuffle lowering.
Nothing was relying on this and there are potentially some edge cases
that it would not be correct under. Removing it seems better than trying
to "fix" it as nothing was relying on it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218755 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 11:13:57 +00:00
Tom Coxon
01649dea92 [AArch64] Allow access to all system registers with MRS/MSR instructions.
The A64 instruction set includes a generic register syntax for accessing
implementation-defined system registers. The syntax for these registers is:
    S<op0>_<op1>_<CRn>_<CRm>_<op2>

The encoding space permitted for implementation-defined system registers
is:
    op0 op1  CRn   CRm   op2
    11  xxx  1x11  xxxx  xxx

The full encoding space can now be accessed:
    op0 op1  CRn   CRm   op2
    xx  xxx  xxxx  xxxx  xxx

This is useful to anyone needing to write assembly code supporting new
system registers before the assembler has learned the official names for
them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218753 91177308-0d34-0410-b5e6-96231b3b80d8
2014-10-01 10:13:59 +00:00