Commit Graph

33641 Commits

Author SHA1 Message Date
Zoran Jovanovic
6b6dc8a1f6 [mips][microMIPSr6] Implement MUL, MUH, MULU and MUHU instructions
Differential Revision: http://reviews.llvm.org/D8894


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236131 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-29 17:23:22 +00:00
Reid Kleckner
86b699c278 [X86] Avoid mangling frameescape labels
x86 Windows uses the '_' prefix for all global symbols, and this was
mistakenly being applied to frameescape labels, which are not externally
visible global symbols. They use the private global prefix 'L'.

The *right* way to fix this is probably to stop masquerading this label
as an ExternalSymbol and create a new SDNode type. These labels are not
"external", and we know they will be resolved by assembly time. Having a
custom SDNode type would allow us to do better X86 address mode
matching, so it's probably worth doing eventually.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236123 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-29 16:46:01 +00:00
Duncan P. N. Exon Smith
e56023a059 IR: Give 'DI' prefix to debug info metadata
Finish off PR23080 by renaming the debug info IR constructs from `MD*`
to `DI*`.  The last of the `DIDescriptor` classes were deleted in
r235356, and the last of the related typedefs removed in r235413, so
this has all baked for about a week.

Note: If you have out-of-tree code (like a frontend), I recommend that
you get everything compiling and tests passing with the *previous*
commit before updating to this one.  It'll be easier to keep track of
what code is using the `DIDescriptor` hierarchy and what you've already
updated, and I think you're extremely unlikely to insert bugs.  YMMV of
course.

Back to *this* commit: I did this using the rename-md-di-nodes.sh
upgrade script I've attached to PR23080 (both code and testcases) and
filtered through clang-format-diff.py.  I edited the tests for
test/Assembler/invalid-generic-debug-node-*.ll by hand since the columns
were off-by-three.  It should work on your out-of-tree testcases (and
code, if you've followed the advice in the previous paragraph).

Some of the tests are in badly named files now (e.g.,
test/Assembler/invalid-mdcompositetype-missing-tag.ll should be
'dicompositetype'); I'll come back and move the files in a follow-up
commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236120 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-29 16:38:44 +00:00
Zoran Jovanovic
3cf9e970d3 [mips][microMIPSr6] Implement SUB and SUBU instructions
Differential Revision: http://reviews.llvm.org/D8764


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236118 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-29 16:22:46 +00:00
Zoran Jovanovic
b26cc705b0 [mips][microMIPSr6] Implement ADD, ADDU and ADDIU instructions
Differential Revision: http://reviews.llvm.org/D8704


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236111 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-29 15:11:07 +00:00
James Y Knight
0e13ba8208 Sparc: Prefer reg+reg address encoding when only one register used.
Reg+%g0 is preferred to Reg+imm0 by the manual, and is what GCC produces.

Futhermore, reg+imm is invalid for the (not yet supported) "alternate
address space" instructions.

Differential Revision: http://reviews.llvm.org/D8753

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236107 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-29 14:54:44 +00:00
Vasileios Kalintiris
56d0e00515 Mips fast-isel - handle functions which return i8 or i6 .
Summary: Allow Mips fast-isel to handle functions which return i8/i16 signed/unsigned.

Test Plan:
Make check tests are forthcoming.
Already passes test-suite at O0/O2 for Mips 32 r1/r2

Reviewers: dsanders, rkotler

Subscribers: llvm-commits, rfuhler

Differential Revision: http://reviews.llvm.org/D6765

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236103 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-29 14:17:14 +00:00
Daniel Sanders
b6d2c5a952 [mips] Correct 128-bit shifts on 64-bit targets.
Summary:
The existing code was correct for 32-bit GPR's but not 64-bit GPR's. It now
accounts for both cases.

Reviewers: vkalintiris

Reviewed By: vkalintiris

Subscribers: llvm-commits, mohit.bhakkad, sagar

Differential Revision: http://reviews.llvm.org/D9337

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236099 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-29 12:28:58 +00:00
Toma Tabacu
5be17c48fb [mips] [IAS] Inline assemble-time shifting out of createLShiftOri. NFC.
Summary:
Do the assemble-time shifts from createLShiftOri at the source, which groups all the shifting together, closer to the main logic path, and
store the results in concisely-named variables to improve code clarity.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8973

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236096 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-29 10:19:56 +00:00
Elena Demikhovsky
fbb2bab1e4 fixed 80-chars; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236093 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-29 08:49:57 +00:00
Eric Christopher
f506831ede Reuse a lookup in an assert.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236054 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 22:38:35 +00:00
Tim Northover
9f7d13868a ARM: fix peephole optimisation of TST
We were trying to look through COPY instructions, but only to the next
instruction in a BB and incorrectly anyway. The cases where that would actually
be a good idea are rare enough (and not even tested!) that it's not worth
trying to get right.

rdar://20721342

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236050 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 22:03:55 +00:00
James Y Knight
642098ac59 Sparc: Add alternate aliases for conditional branch instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236042 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 21:27:31 +00:00
Alexei Starovoitov
04877fa7ff [bpf] fix build
Patch by Brenden Blanco.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236030 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 20:38:56 +00:00
Sanjay Patel
f060668270 [x86] remove RCPPS and RSQRTPS intrinsic instruction definitions
We don't need codegen-only intrinsic instructions for the vector forms of these instructions.

This makes the reciprocal estimate instruction lowering identical to how we handle normal
square roots: (V)SQRTPS / (V)SQRTPD.

No existing regression tests fail with this patch.

Differential Revision: http://reviews.llvm.org/D9301



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236013 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 18:48:45 +00:00
Eric Christopher
5d88757074 Add a fixme to resetTargetOptions to explain why it needs to go
away.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236009 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 18:09:05 +00:00
Eric Christopher
a4944e5188 Fix a [-Werror,-Winconsistent-missing-override] problem in the
NVPTX overrides.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236007 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 18:06:27 +00:00
Tom Stellard
53fec21fbe R600: Fix up for AsmPrinter's OutStreamer being a unique_ptr
Fixes a crash with basically any OpenGL application using the radeonsi
driver.

Patch by: Michel Dänzer

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90176
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236004 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 17:37:03 +00:00
Tom Stellard
1630fbeb91 R600/SI: Add a lower case alias for subtarget feature: +DumpCode
llc converts all feature strings to lower case, while the LLVM C API
does not, so we need a lower case alias in order to test this with llc.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236003 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 17:37:00 +00:00
Justin Holewinski
0292a66bb1 [NVPTX] Handle addrspacecast constant expressions in aggregate initializers
We need to track if an AddrSpaceCast expression was seen when
generating an MCExpr for a ConstantExpr.  This change introduces a
custom lowerConstant method to the NVPTX asm printer that will create
NVPTXGenericMCSymbolRefExpr nodes at the appropriate places to encode
the information that a given symbol needs to be casted to a generic
address.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@236000 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 17:18:30 +00:00
Sanjay Patel
2a7841dd4d move IR-level optimization flags into their own struct
This is a preliminary step to using the IR-level floating-point fast-math-flags in the SDAG (D8900).

In this patch, we introduce the optimization flags as their own struct. As noted in the TODO comment, 
we should eventually share this data between the IR passes and the backend.

We also switch the existing nsw / nuw / exact bit functionality of the BinaryWithFlagsSDNode class to
use the new struct.

The tradeoff is that instead of using the free but limited space of SDNode's SubclassData, we add a
data member to the subclass. This means we don't have to repeat all of the get/set methods per flag,
but we're potentially adding size to all nodes of this subclassi type.

In practice on 64-bit systems (measured on Linux and MacOS X), there is no size difference between an
SDNode and BinaryWithFlagsSDNode after this change: they're both 80 bytes. This means that we had at
least one free byte to play with due to struct alignment.

Differential Revision: http://reviews.llvm.org/D9325



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235997 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 16:39:12 +00:00
Elena Demikhovsky
83259d70bb Fixed crash of variable shift inst on AVX2
https://llvm.org/bugs/show_bug.cgi?id=22955



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235993 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 14:46:35 +00:00
Toma Tabacu
8bec0f9db1 [mips] [IAS] Do not generate redundant ORi in createLShiftOri.
Summary: If the immediate is 0, the ORi is pointless.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8969

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235990 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 14:06:35 +00:00
Sergey Dmitrouk
1f7a90d793 Reapply r235977 "[DebugInfo] Add debug locations to constant SD nodes"
[DebugInfo] Add debug locations to constant SD nodes

This adds debug location to constant nodes of Selection DAG and updates
all places that create constants to pass debug locations
(see PR13269).

Can't guarantee that all locations are correct, but in a lot of cases choice
is obvious, so most of them should be. At least all tests pass.

Tests for these changes do not cover everything, instead just check it for
SDNodes, ARM and AArch64 where it's easy to get incorrect locations on
constants.

This is not complete fix as FastISel contains workaround for wrong debug
locations, which drops locations from instructions on processing constants,
but there isn't currently a way to use debug locations from constants there
as llvm::Constant doesn't cache it (yet). Although this is a bit different
issue, not directly related to these changes.

Differential Revision: http://reviews.llvm.org/D9084

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235989 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 14:05:47 +00:00
Daniel Jasper
515cc265c9 Revert "[DebugInfo] Add debug locations to constant SD nodes"
This breaks a test:
http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/23870

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235987 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 13:38:35 +00:00
Toma Tabacu
f9fc545757 [mips] [IAS] Rename the createShiftOr function to createLShiftOri. NFC.
Summary: The new name is more accurate with regard to the functionality.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8968

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235984 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 13:16:06 +00:00
Toma Tabacu
c3adf30a03 [mips] [IAS] Store the expandLoadImm destination register in a variable. NFC.
Summary: This removes multiple calls to getReg() and saves us column space in the source file.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8924

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235978 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 12:04:53 +00:00
Sergey Dmitrouk
716c5d8a30 [DebugInfo] Add debug locations to constant SD nodes
This adds debug location to constant nodes of Selection DAG and updates
all places that create constants to pass debug locations
(see PR13269).

Can't guarantee that all locations are correct, but in a lot of cases choice
is obvious, so most of them should be. At least all tests pass.

Tests for these changes do not cover everything, instead just check it for
SDNodes, ARM and AArch64 where it's easy to get incorrect locations on
constants.

This is not complete fix as FastISel contains workaround for wrong debug
locations, which drops locations from instructions on processing constants,
but there isn't currently a way to use debug locations from constants there
as llvm::Constant doesn't cache it (yet). Although this is a bit different
issue, not directly related to these changes.

Differential Revision: http://reviews.llvm.org/D9084

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235977 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 11:56:37 +00:00
Elena Demikhovsky
44a0c9071a AVX-512: Added "pandn" intrinsics set
by Asaf Badouh (asaf.badouh@intel.com)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235971 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 08:12:42 +00:00
Ahmed Bougacha
e1f835ab59 [MC] Use LShr for constant evaluation of ">>" on ELF/arm64--darwin.
This matches other assemblers and is less unexpected (e.g. PR23227).
On ELF, I tried binutils gas v2.24 and nasm 2.10.09, and they both
agree on LShr.  On COFF, I couldn't get my hands on an assembler yet,
so don't change the behavior.  For now, don't change it on non-AArch64
Darwin either, as the other assembler is gas v1.38, which does an AShr.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235963 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 01:37:11 +00:00
Matthias Braun
0dff695ba9 Cleanup, remove unused return value
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235952 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-28 00:37:05 +00:00
Sanjay Patel
fd55d49f65 remove obsolete pattern matches for scalar SSE ops
The blendi pattern should always replace the insertps pattern after:
http://reviews.llvm.org/rL232850
http://reviews.llvm.org/rL235124



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235930 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-27 22:23:17 +00:00
Ahmed Bougacha
ae618e7873 [AArch64] Also combine vector selects fed by non-i1 SETCCs.
After legalization, scalar SETCC has an i32 result type on AArch64.
The i1 requirement seems too conservative, replace it with an assert.

This also means that we now can run after legalization. That should also
be fine, since the ops legalizer runs again after each combine, and
all types created all have the same sizes as the (legal) inputs.

Exposed by r235917; while there, robustize its tests (bsl also uses the
register it defines).


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235922 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-27 21:43:12 +00:00
Ahmed Bougacha
bc92b2ca37 [AArch64] Don't assert when combining (v3f32 select (setcc f64)).
When the setcc has f64 operands, we can't build a vector setcc mask
to feed a vselect, because f64 doesn't divide v3f32 evenly.
Just bail out when that happens.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235917 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-27 21:01:20 +00:00
Bill Schmidt
445b1c78a5 Silence unused variable errors for no-asserts builds
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235913 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-27 20:22:35 +00:00
Bill Schmidt
dcc4f724cc [PPC64LE] Remove unnecessary swaps from lane-insensitive vector computations
This patch adds a new SSA MI pass that runs on little-endian PPC64
code with VSX enabled. Loads and stores of 4x32 and 2x64 vectors
without alignment constraints are accomplished for little-endian using
lxvd2x/xxswapd and xxswapd/stxvd2x. The existence of the additional
xxswapd instructions hurts performance in comparison with big-endian
code, but they are necessary in the general case to support correct
semantics.

However, the general case does not apply to most vector code. Many
vector instructions are lane-insensitive; they do not "care" which
lanes the parallel computations are performed within, provided that
the resulting data is stored into the correct locations. Thus this
pass looks for computations that perform only lane-insensitive
operations, and remove the unnecessary swaps from loads and stores in
such computations.

Future improvements will allow computations using certain
lane-sensitive operations to also be optimized in this manner, by
modifying the lane-sensitive operations to account for the permuted
order of the lanes. However, this patch only adds the infrastructure
to permit this; no lane-sensitive operations are optimized at this
time.

This code is heavily exercised by the various vectorizing applications
in the projects/test-suite tree. For the time being, I have only added
one simple test case to demonstrate what the pass is doing. Although
it is quite simple, it provides coverage for much of the code,
including the special case handling of copies and subreg-to-reg
operations feeding the swaps. I plan to add additional tests in the
future as I fill in more of the "special handling" code.

Two existing tests were affected, because they expected the swaps to
be present, but they are now removed.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235910 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-27 19:57:34 +00:00
Sanjay Patel
95619d83af fix 80-cols; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235902 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-27 17:45:44 +00:00
Sanjay Patel
4e9cdece79 fix typos; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235896 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-27 17:03:31 +00:00
Toma Tabacu
b363a5846d [mips] Correct bytes to bits in 2 comments. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235891 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-27 15:21:38 +00:00
Elena Demikhovsky
f8ae1af2e1 AVX-512: added calling conventions for i1 vectors.
Fixed bug: https://llvm.org/bugs/show_bug.cgi?id=20724



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235889 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-27 15:11:19 +00:00
Brendon Cahoon
2afd045e03 [Hexagon] Use constant extenders to fix up hardware loops
Use a loop instruction with a constant extender for a hardware
loop instruction that is too far away from the start of the loop.
This is cheaper than changing the SA register value.

Differential Revision: http://reviews.llvm.org/D9262


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235882 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-27 14:16:43 +00:00
Toma Tabacu
af3ec2cfd4 [mips] [IAS] Improve warning for using AT with .set noat.
Summary:
Changed the warning message to show the current value of $at, similar to what clang does for typedef's, and renamed warnIfAssemblerTemporary to a more descriptive name.

I also changed the type of variables which store registers from int to unsigned, updated the relevant test and tried to make the related comments clearer.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8479

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235881 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-27 14:05:04 +00:00
Vasileios Kalintiris
659d53e897 Reapply "[mips][FastISel] Implement shift ops for Mips fast-isel.""
This reapplies r235194, which was reverted in r235495 because it was causing a
failure in our out-of-tree buildbots for MIPS. With the sign-extension patch
in r235718, this patch doesn't cause any problem any more.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235878 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-27 13:28:05 +00:00
Toma Tabacu
23c4ea3fda [mips] [IAS] Rename getATRegNum and setATReg to {g,s}etATRegIndex. NFC.
Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8480

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235877 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-27 13:12:59 +00:00
Elena Demikhovsky
17bbdd05dd AVX-512: Extend/Truncate operations for SKX,
SETCC for bit-vectors



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235875 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-27 12:57:59 +00:00
Simon Pilgrim
6df35e7844 [X86][SSE] Add v16i8/v32i8 multiplication support
Patch to allow int8 vectors to be multiplied on the SSE unit instead of being scalarized.

The patch sign extends the i8 lanes to i16, uses the SSE2 pmullw multiplication instruction, then packs the lower byte from each result.

Differential Revision: http://reviews.llvm.org/D9115

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235837 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-27 07:55:46 +00:00
Alexei Starovoitov
ad82d26669 [bpf] fix build and remove a compiler warning in Release mode
Patch by Brenden Blanco.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235814 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-26 01:58:08 +00:00
Benjamin Kramer
b6ee10d9e5 [ARM] Simplify code. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235803 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-25 17:25:13 +00:00
Benjamin Kramer
4b86b0db83 [hexagon] Use range-based for loops. No functionality change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235802 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-25 14:46:53 +00:00
Benjamin Kramer
79408c7dc8 [hexagon] Remove setHexLibcallName, it leaks memory.
Just spell out the full names, it's not that much more code.
No functional change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235801 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-25 14:46:46 +00:00
Lang Hames
579cebfb15 [AsmPrinter] Make AsmPrinter's OutStreamer member a unique_ptr.
AsmPrinter owns the OutStreamer, so an owning pointer makes sense here. Using a
reference for this is crufty.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235752 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-24 19:11:51 +00:00
Vasileios Kalintiris
7739394dad [mips][FastISel] Specify which types we handle for integer extension.
Summary:
Perform integer extension only when the destination type is one of
i8, i16 & i32 and when the source type is i1, i8 or i16. For other
combinations we fall back to SelectionDAG.

This fixes the test MultiSource/Benchmarks/7zip that was failing in our
out-of-tree MIPS buildbots.

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9243

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235718 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-24 13:48:19 +00:00
Jingyue Wu
728ad0157c Resurrect r235688
We should skip vector types which are not SCEVable.

test/CodeGen/NVPTX/sched2.ll passes


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235695 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-24 04:22:39 +00:00
Jingyue Wu
f42450abb6 Revert r235688
Seems breaking builds


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235690 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-24 03:26:11 +00:00
Jingyue Wu
b55e9545f2 [NVPTX] Emits "generic()" depending on the original address space
Summary:
Fixes a bug in the NVPTX codegen. The code used to miss necessary "generic()"
on aggregates of addrspacecasts.

Test Plan: addrspacecast-gvar.ll

Reviewers: eliben, jholewinski

Reviewed By: jholewinski

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D9130

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235689 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-24 02:57:30 +00:00
Jingyue Wu
d83b3b1a8d [NVPTX] enable NaryReassociate in NVPTX
Summary:
We run NaryReassociate right after SLSR because SLSR enables many
opportunities for NaryReassociate. For example, in nary-slsr.ll

  foo((a + b) + c);
  foo((a + b * 2) + c);
  foo((a + b * 3) + c);   // 2 muls and 6 adds

after SLSR:

  ab = a + b;
  foo(ab + c);
  ab2 = ab + b;
  foo(ab2 + c);
  ab3 = ab2 + b;
  foo(ab3 + c);           // 6 adds

after NaryReassociate:

  abc = (a + b) + c;
  foo(abc);
  ab2c = abc + b;
  foo(ab2c);
  ab3c = ab2c + b;
  foo(ab3c);              // 4 adds

Test Plan: nary-slsr.ll

Reviewers: jholewinski, eliben

Reviewed By: eliben

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D9066

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235688 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-24 02:54:06 +00:00
Matt Arsenault
9dd5b1fbd8 R600/SI: Fix verifier error when producing v_madmk_f32
Copy the kill flags when swapping the operands.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235687 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-24 01:57:58 +00:00
Matthias Braun
bd112c28b5 R600/RegisterCoalescer: Enable more rematerialization/add missing testcase
This enables the rematerialization of some R600 MOV instructions in the
RegisterCoalescer and adds a testcase for r235668.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235675 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-24 00:25:50 +00:00
Matt Arsenault
784f73714b R600/SI: Special case v_mov_b32 as really rematerializable
This should be fixed to properly understand all rematerializable
instructions while ignoring implicit reads of exec.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235671 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 23:34:48 +00:00
Hal Finkel
100eab89f5 [PowerPC] Support register name prefixes for vector registers
Match binutils by supporting the optional register name prefix for new vector
registers ("vs" for VSX registers and "q" for QPX registers).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235665 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 23:16:22 +00:00
Hal Finkel
ba03f542ac [PowerPC] Use sync inst alias when printing
So long as the choice between printing msync and sync is not ambiguous, we can
print 'sync 0' and just 'sync'.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235663 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 23:05:08 +00:00
Tom Stellard
6d49b023a4 R600: Correctly lower CONCAT_VECTOR nodes with more than 2 operands
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235662 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 22:59:24 +00:00
Hal Finkel
79f43b2736 [PowerPC] Add asm/disasm support for dcbt with hint
Add assembler/disassembler support for dcbt/dcbtst (and aliases) with the hint
field specified (non-zero). Unforunately, the syntax for this instruction is
special in that it differs for server vs. embedded cores:
   dcbt ra, rb, th [server]
   dcbt th, ra, rb [embedded]
where th can be omitted when it is 0. dcbtst is the same. Thus we need to play
games in the parser and the printer to flip the operands around on the embedded
cores. We'll use the server syntax as the default (binutils currently uses the
embedded form by default, but IBM is changing that).

We also stop marking dcbtst as having unmodeled side effects (this is not
necessary, it is just a hint like dcbt -- noticed by inspection, so no separate
test case).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235657 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 22:47:57 +00:00
Krzysztof Parzyszek
9ed816ae52 Unbreak build
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235646 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 20:57:39 +00:00
Krzysztof Parzyszek
b5ce56f874 [Hexagon] Minor cleanup in HexagonFrameLowering
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235645 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 20:42:20 +00:00
Tom Stellard
2aab32cade R600/SI: Fix indirect addressing with a negative constant offset
When the base register index of the vector plus the constant offset
was less than zero, we were passing the wrong base register to the indirect
addressing instruction.

In this case, we need to set the base register to v0 and then add
the computed (negative) index to m0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235641 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 20:32:01 +00:00
Peter Collingbourne
391b2c39f7 Thumb2: When applying branch optimizations, visit branches in reverse order.
The order in which branches appear in ImmBranches is approximately their
order within the function body. By visiting later branches first, we reduce
the distance between earlier forward branches and their targets, making it
more likely that the cbn?z optimization, which can only apply to forward
branches, will succeed for those earlier branches.

Differential Revision: http://reviews.llvm.org/D9185

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235640 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 20:31:35 +00:00
Peter Collingbourne
1ad0f74155 ARM: When re-creating a branch via InsertBranch, preserve CPSR flags.
In particular, this preserves the kill flag, which allows the Thumb2 cbn?z
optimization to be applied in cases where a branch has been re-created after
the live variables analysis pass, e.g. by the machine block placement pass.

This appears to be low risk; a number of other targets seem to already be
doing something similar, e.g. AArch64, PowerPC.

Differential Revision: http://reviews.llvm.org/D9184

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235639 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 20:31:32 +00:00
Peter Collingbourne
d9a479e5a0 Thumb2: When optimizing for size, do not if-convert branches involving comparisons with zero.
This allows the constant island pass to lower these branches to cbn?z
instructions, resulting in a shorter instruction sequence.

Differential Revision: http://reviews.llvm.org/D9183

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235638 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 20:31:30 +00:00
Peter Collingbourne
f86c29ea2c ARM: When spilling extra registers for alignment, prefer low registers on all Thumb targets.
This makes it more likely that we can use the 16-bit push and pop instructions
on Thumb-2, saving around 4 bytes per function.

Differential Revision: http://reviews.llvm.org/D9165

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235637 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 20:31:26 +00:00
Peter Collingbourne
b28abbf98b ARM: Only enforce 4-byte alignment on Thumb-2 functions with constant pools.
This appears to have been introduced back in r76698 as part of an unrelated
change. I can find no official ARM documentation stating that Thumb-2 functions
require 4-byte alignment; in fact, ARM documentation appears to contradict
this (see, e.g., ARM Architecture Reference Manual Thumb-2 Supplement,
section 2.6.1: "Thumb-2 enforces 16-bit alignment on all instructions.").

Also remove code that sets alignment for ARM functions, which is redundant
with code in the MachineFunction constructor, and remove the hidden
-arm-align-constant-islands flag, which has been enabled by default since
r146739 (Dec 2011) and has probably received sufficient testing by now.

Differential Revision: http://reviews.llvm.org/D9138

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235636 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 20:31:22 +00:00
Krzysztof Parzyszek
ab740d1e40 [Hexagon] Fix compiler warnings in release build
Patch by Aditya Nandakumar.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235635 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 20:26:21 +00:00
Jingyue Wu
12f341611a [NVPTX] run SeparateConstOffsetFromGEP before SLSR
Summary:
We pick this order because SeparateConstOffsetFromGEP may create more
opportunities for SLSR.

Test Plan:
reassociate-geps-and-slsr.ll
no performance regression on internal benchmarks

Reviewers: meheff

Subscribers: llvm-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D9230

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235632 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 20:00:04 +00:00
Tom Stellard
59edae9b85 R600/SI: Add assembler support for all CI and VI VOP1 instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235629 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 19:33:54 +00:00
Tom Stellard
f0924551e6 R600/SI: v_mov_fed_b32 does not exist on VI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235628 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 19:33:52 +00:00
Tom Stellard
bf02414898 R600/SI: Use a better error message for unsupported instructions in the assembler
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235627 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 19:33:51 +00:00
Tom Stellard
95081f5241 R600/SI: Improve AsmParser support for forced e64 encoding
We can now force e64 encoding even when the operands would be legal
for e32 encoding.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235626 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 19:33:48 +00:00
Hal Finkel
184f8f7c10 [PowerPC] Enable printing instructions using aliases
TableGen had been nicely generating code to print a number of instructions using
shorter aliases (and PowerPC has plenty of short mnemonics), but we were not
calling it. For some of the aliases we support in the parser, TableGen can't
infer the "inverse" alias relationship, so there is still more to do.

Thus, after some hours of updating test cases...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235616 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 18:30:38 +00:00
Pirama Arumuga Nainar
dab5145cb3 [AArch64] Add nvcast patterns for v4f16 and v8f16
Summary:
Constant stores of f16 vectors can create NvCast nodes from various
operand types to v4f16 or v8f16 depending on patterns in the stored
constants.  This patch adds nvcast rules with v4f16 and v8f16 values.

AArchISelLowering::LowerBUILD_VECTOR has the details on which constant
patterns generate the nvcast nodes.

Reviewers: jmolloy, srhines, ab

Subscribers: rengolin, aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D9201

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235610 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 17:32:25 +00:00
Pirama Arumuga Nainar
b7db5f28c5 [AArch64] Handle vec4, vec8, vec16 *itofp for half
Summary:
Set operation action for SINT_TO_FP and UINT_TO_FP nodes with v4i32,
v8i8, v8i16 inputs to allow promotion of v4f16 results.

Add tests for sitofp and uitofp for vec4, vec8, vec16, and i8, i16, i32,
and i64 vectors.  Only missing tests are for v16i8 and v16i16 as the
shift operations are too complicated to write a proper check sequence.

The conversions from v4i64 to v4f16 do not depend on this patch - v4i64
is split and the conversion gets handled while lowering v2i64.  I am
adding a test here for completeness.

Reviewers: aemerson, rengolin, ab, jmolloy, srhines

Subscribers: rengolin, aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D9166

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235609 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 17:16:27 +00:00
Hans Wennborg
defaf830f9 Re-commit r235560: Switch lowering: extract jump tables and bit tests before building binary tree (PR22262)
Third time's the charm. The previous commit was reverted as a
reverse for-loop in SelectionDAGBuilder::lowerWorkItem did 'I--'
on an iterator at the beginning of a vector, causing asserts
when using debugging iterators. This commit fixes that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235608 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 16:45:24 +00:00
Krzysztof Parzyszek
de0d4bf1d4 [Hexagon] Shrink-wrap stack frame (Hexagon-specific)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235603 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 16:05:39 +00:00
Toma Tabacu
b685acfe78 [mips] [IAS] Move NOP emission after pseudo-instruction expansion. NFC.
As suggested in the review for http://reviews.llvm.org/D8537.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235601 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 14:48:38 +00:00
Aaron Ballman
5d538f71c2 Revert r235560; this commit was causing several failed assertions in Debug builds using MSVC's STL. The iterator is being used outside of its valid range.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235597 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-23 13:41:59 +00:00
Hans Wennborg
395f4f4b2a Switch lowering: extract jump tables and bit tests before building binary tree (PR22262)
This is a re-commit of r235101, which also fixes the problems with the previous patch:

- Switches with only a default case and non-fallthrough were handled incorrectly

- The previous patch tickled a bug in PowerPC Early-Return Creation which is fixed here.

> This is a major rewrite of the SelectionDAG switch lowering. The previous code
> would lower switches as a binary tre, discovering clusters of cases
> suitable for lowering by jump tables or bit tests as it went along. To increase
> the likelihood of finding jump tables, the binary tree pivot was selected to
> maximize case density on both sides of the pivot.
>
> By not selecting the pivot in the middle, the binary trees would not always
> be balanced, leading to performance problems in the generated code.
>
> This patch rewrites the lowering to search for clusters of cases
> suitable for jump tables or bit tests first, and then builds the binary
> tree around those clusters. This way, the binary tree will always be balanced.
>
> This has the added benefit of decoupling the different aspects of the lowering:
> tree building and jump table or bit tests finding are now easier to tweak
> separately.
>
> For example, this will enable us to balance the tree based on profile info
> in the future.
>
> The algorithm for finding jump tables is quadratic, whereas the previous algorithm
> was O(n log n) for common cases, and quadratic only in the worst-case. This
> doesn't seem to be major problem in practice, e.g. compiling a file consisting
> of a 10k-case switch was only 30% slower, and such large switches should be rare
> in practice. Compiling e.g. gcc.c showed no compile-time difference.  If this
> does turn out to be a problem, we could limit the search space of the algorithm.
>
> This commit also disables all optimizations during switch lowering in -O0.
>
> Differential Revision: http://reviews.llvm.org/D8649

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235560 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 23:14:56 +00:00
Krzysztof Parzyszek
bbe056c9bc [Hexagon] Some cleanup of instruction selection code
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235552 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 21:17:00 +00:00
Krzysztof Parzyszek
3c55df1e84 [Hexagon] Use A2_tfrsi for constant pool and jump table addresses
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235535 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 18:25:53 +00:00
Pete Cooper
c61f7144eb [AArch64] Use MachineRegisterInfo instead of LiveIntervals to calculate liveness. NFC.
The CondOpt pass currently uses LiveIntervals to set the dead flag on a def.  This patch uses MachineRegisterInfo::use_empty instead as that is equivalent to the def being dead.

This removes an instance of LiveIntervals in the pass manager pipeline and saves 3.8% of compile time on llc conpiled for AArch64.

Reviewed by Chad Rosier and Zhaoshi.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235532 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 18:05:13 +00:00
Krzysztof Parzyszek
a21ae0affd [Hexagon] Consider constant-extended offsets to be valid
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235529 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 17:51:26 +00:00
Krzysztof Parzyszek
b4e6e4e78d Fix Windows build break: use LLVM_FUNCTION_NAME instead of __func__.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235525 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 17:19:44 +00:00
Matt Arsenault
a37c0d278b R600: Fix always inline pass breaking noinline functions
No test since calls are not actually supported yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235524 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 17:10:44 +00:00
Krzysztof Parzyszek
5ce227e787 [Hexagon] Overhaul of stack object allocation
- Use static allocation for aligned stack objects.
- Simplify dynamic stack object allocation.
- Simplify elimination of frame-indices.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235521 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 16:43:53 +00:00
Sanjay Patel
3f1f6571cc [x86] Add store-folded memop patterns for vcvtps2ph
Differential Revision: http://reviews.llvm.org/D7296



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235517 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 16:11:19 +00:00
Krzysztof Parzyszek
637f84c80e [Hexagon] Treat CFI as solo instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235516 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 15:47:35 +00:00
Krzysztof Parzyszek
bef3fd23e7 [Hexagon] Implement HexagonInstPrinter::printRegName
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235514 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 15:38:17 +00:00
Andrea Di Biagio
6c347524e2 [X86][AVX] Fix failure due to a missing ISel pattern to select VBROADCAST nodes (PR23259).
This fixes a regression introduced at revision 218263.

On AVX, if we optimize for size, a splat build_vector of a load
is lowered into a VBROADCAST node. This is done even if the value type of the
splat build_vector node is v2i64.

Since AVX doesn't support v2f64/v2i64 broadcasts, revision 218263 added two
extra tablegen patterns to allow selecting a VMOVDDUPrm from an X86VBroadcast
where the scalar element comes from a loadi64/loadf64.

However, revision 218263 forgot to add an extra fallback pattern for the case
where we have a X86VBroadcast of a loadi64 with multiple uses.

This patch adds the missing tablegen pattern in X86InstrSSE.td.
This patch also adds an extra test to 'splat-for-size.ll' to verify that ISel
doesn't crash with a 'fatal error in the backend' due to a missing AVX pattern
to select v2i64 X86ISD::BROADCAST nodes.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235509 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 14:53:39 +00:00
Zoran Jovanovic
d311051513 [mips][microMIPSr6] Implement mips32 to microMIPSr6 mapping support
Differential Revision: http://reviews.llvm.org/D8661


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235505 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 13:27:34 +00:00
Vasileios Kalintiris
21249eca6b Revert "[mips][FastISel] Implement shift ops for Mips fast-isel."
This reverts commit r235194. It was causing a failure in FastISel buildbots
due to sign-extension issues.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235495 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 10:08:46 +00:00
James Molloy
89cc8dd3b8 [AArch64] Disable complex GEP optimization by default.
Enough concerns were raised that this optimization is pessimising some code patterns.

The obvious fix, to add a Reassociate run afterwards, causes even more pessimisation in some cases due to fewer complex addressing modes being matched. As there isn't a trivial fix for this, backing this out by default until someone gets a chance to fix the addressing mode matcher.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235491 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 09:11:38 +00:00
Lang Hames
a1c0ce8518 [patchpoint] Add support for symbolic patchpoint targets to SelectionDAG and the
X86 backend.

The code generated for symbolic targets is identical to the code generated for
constant targets, except that a relocation is emitted to fix up the actual
target address at link-time. This allows IR and object files containing
patchpoints to be cached across JIT-invocations where the target address may
change.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235483 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 06:02:31 +00:00
Sanjay Patel
2b2b3a87da [x86] allow 64-bit extracted vector element integer stores on a 32-bit system
With SSE2, we can generate a 'movq' or other 64-bit store op on a 32-bit system
even though 64-bit integers are not legal types.

So instead of producing this:

  pshufd	$229, %xmm0, %xmm1      ## xmm1 = xmm0[1,1,2,3]
  movd	%xmm0, (%eax)
  movd	%xmm1, 4(%eax)

We can do:

  movq %xmm0, (%eax)

This is a fix for the problem noted in D7296.

Differential Revision: http://reviews.llvm.org/D9134



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235460 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-22 00:24:30 +00:00
Krzysztof Parzyszek
a42f6b9a58 [Hexagon] Patterns for frame index with offset for isel
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235418 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 21:28:03 +00:00
Jingyue Wu
0f710d7a5a [NVPTX] do not run DCE after SLSR and SeparateConstOffsetFromGEP
Summary:
With D9096 and D9101, there's no need to run DCE after SLSR and
SeparateConstOffsetFromGEP.

Test Plan: no regression

Reviewers: jholewinski, meheff

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D9172

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235415 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 20:47:15 +00:00
Matthias Braun
9e0a1565b9 X86: Match for X86ISD nodes in LowerBUILD_VECTOR instead of BUILD_VECTORCombine
There doesn't seem to be a reason to perform this target ISD node matching
in an DAGCombine, moving it to lowering fixes PR23296.

Differential Revision: http://reviews.llvm.org/D9137

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235394 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 17:21:36 +00:00
Elena Demikhovsky
bf704ed348 AVX-512: Added VPMOVx2M instructions for SKX,
fixed encoding of VPMOVM2x.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235385 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 14:38:31 +00:00
Elena Demikhovsky
695922de3d AVX-512: Added VPTESTM and VPTESTNM instructions for SKX
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235383 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 13:13:46 +00:00
Toma Tabacu
203a9224ff [mips] [IAS] Implement the .asciiz directive.
Summary:
This directive is exactly the same as .asciz, except it's only used by MIPS.
It is used to store null terminated strings in object files.

Reviewers: rafael, dsanders, echristo

Reviewed By: dsanders, echristo

Subscribers: echristo, llvm-commits

Differential Revision: http://reviews.llvm.org/D7530

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235382 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 11:50:52 +00:00
Jozef Kolek
c589d1b3bc [mips][microMIPSr6] Implement CACHE and PREF instructions
Implement CACHE and PREF instructions using mapping.

Differential Revision: http://reviews.llvm.org/D8893


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235379 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 11:17:25 +00:00
Vasileios Kalintiris
6cd90d3b21 [mips] Cleanup old floating-point flag conditions definitions. NFC.
Reviewers: dsanders

Differential Revision: http://reviews.llvm.org/D7947

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235377 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 10:53:57 +00:00
Vasileios Kalintiris
d72ba1af57 [mips] Optimize code generation for 64-bit variable shift instructions.
Summary:
The 64-bit version of the variable shift instructions uses the
shift_rotate_reg class which uses a GPR32Opnd to specify the variable
shift amount. With this patch we avoid the generation of a redundant
SLL instruction for the variable shift instructions in 64-bit targets.

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D7413

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235376 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 10:49:03 +00:00
Elena Demikhovsky
a1fa0de258 AVX-512: Added logical and arithmetic instructions for SKX
by Asaf Badouh (asaf.badouh@intel.com)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235375 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 10:27:40 +00:00
Simon Pilgrim
01eaaa72bf [X86][SSE] Provide execution domains for scalar floating point operations
This is an updated version of Chandler's patch D7402 that got accepted but never committed, and has bit-rotted a bit since.

I've updated the execution domain declarations to match the approach of the packed templates and also added some extra scalar unary tests.

Differential Revision: http://reviews.llvm.org/D9095

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235372 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 08:40:22 +00:00
Matthias Braun
6fbedc4cfd X86: Do not select X86 custom vector nodes if operand types don't match
X86ISD::ADDSUB, X86ISD::(F)HADD, X86ISD::(F)HSUB should not be selected
if the operand types do not match the result type because vector type
legalization cannot deal with this for custom nodes.

Testcase X86ISD::ADDSUB is attached. I could not create a testcase for
the FHADD/FHSUB cases because of: https://llvm.org/bugs/show_bug.cgi?id=23296

Differential Revision: http://reviews.llvm.org/D9120

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235367 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-21 01:13:41 +00:00
Pirama Arumuga Nainar
9d1b182f81 [MIPS] OperationAction for FP_TO_FP16, FP16_TO_FP
Summary:
Set operation action for FP16 conversion opcodes, so the Op legalizer
can choose the gnu_* libcalls for Mips.

Set LoadExtAction and TruncStoreAction for f16 scalars and vectors to
prevent (fpext (load )) and (store (fptrunc)) from getting combined into
unsupported operations.

Added test cases to test that these operations are handled correctly
for f16 scalars and vectors.  This patch depends on
http://reviews.llvm.org/D8755.

Reviewers: srhines

Subscribers: llvm-commits, ab

Differential Revision: http://reviews.llvm.org/D8804

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235341 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 20:15:36 +00:00
Jozef Kolek
382bee5224 [mips][microMIPSr6] Implement BITSWAP instruction
Implement BITSWAP instruction using mapping.

Differential Revision: http://reviews.llvm.org/D8857


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235321 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 18:14:59 +00:00
Vladimir Sukharev
d1e387b9e6 [AArch64] LORID_EL1 register must be treated as read-only
Patch by: John Brawn

Reviewers: jmolloy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D9105


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235314 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 16:54:37 +00:00
Bill Schmidt
70273be423 [PowerPC] Flow oversized lines for r235309
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235310 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 15:58:46 +00:00
Bill Schmidt
54f902e367 [PowerPC] Add future work for vector insert/extract to README_ALTIVEC.txt
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235309 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 15:54:26 +00:00
Jozef Kolek
fc4915076f [mips][microMIPSr6] Implement disassembler support
Implement disassembler support for microMIPS32r6.

Differential Revision: http://reviews.llvm.org/D8490


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235307 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 14:40:38 +00:00
Jozef Kolek
dbef0175c3 [mips][microMIPSr6] Implement BALC and BC instructions
This patch implements BALC and BC instructions using mapping.

Differential Revision: http://reviews.llvm.org/D8388


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235302 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 13:04:14 +00:00
Jozef Kolek
7ab9941632 [mips][microMIPSr6] Implement initial mapping support
Differential Revision: http://reviews.llvm.org/D8387


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235298 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 12:42:08 +00:00
Jozef Kolek
ec47ed84d2 [mips][microMIPSr6] Implement initial subtarget support
Differential Revision: http://reviews.llvm.org/D8386


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235296 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 12:23:06 +00:00
Andrea Di Biagio
14fc08301c [X86][FastIsel] Fix assertion failure when selecting int-to-double conversion (PR23273).
This fixes a regression introduced at revision 231243.
The target-independent selection algorithm in FastISel knows how to select
a SINT_TO_FP if the target is SSE but not AVX. That is because on X86, the
tablegen'd 'fastEmit' functions know how to select CVTSI2SSrr and CVTSI2SDrr.

Method X86FastISel::X86SelectSIToFP was therefore working under the
wrong assumption that the target was AVX. That assumption was incorrect since
we can have a target that is neither AVX nor SSE.

So, rather than asserting for the presence of AVX, we should have had an
early exit from 'X86SelectSIToFP' if the target was not AVX.
This patch fixes the issue replacing the invalid assertion with an early exit.

Thanks to Dimitry Andric for reporting this problem and for providing a small
reproducible testcase. Added test pr23273.ll.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235295 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-20 11:56:59 +00:00
Simon Pilgrim
ca3e6fafc8 [X86][SSE] Fix for getScalarValueForVectorElement to detect scalar sources requiring truncation.
The fix ensures that scalar sources inserted into a vector are the correct bit size.

Integer scalar sources from BUILD_VECTOR and SCALAR_TO_VECTOR nodes may require truncation that this function doesn't currently support.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235281 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-19 22:16:49 +00:00
Craig Topper
6fa7febee4 Remove unnecessary include and probably a layering violation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235262 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-19 00:57:33 +00:00
Ahmed Bougacha
6b96a388ed [AArch64] Don't force MVT::Untyped when selecting LD1LANEpost.
The result is either an Untyped reg sequence, on ldN with N > 1, or
just the type of the input vector, on ld1.  Don't force Untyped.
Instead, just use the type of the reg sequence.

This mirrors the behavior of createTuple, which feeds the LD1*_POST.

The narrow code path wasn't actually covered by tests, because V64
insert_vector_elt are widened to V128 before the LD1LANEpost combine
has the chance to run, usually.

The only case where it does run on V64 vectors is if the vector ops
legalizer ran.  So, tickle the code with a ctpop.

Fixes PR23265.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235243 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-17 23:43:33 +00:00
Ahmed Bougacha
7ce2cb4b62 [AArch64] Avoid vector->load dependency cycles when creating LD1*post.
They would break the SelectionDAG.
Note that the opposite load->vector dependency is already obvious in:
  (LD1*post vec, ..)


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235224 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-17 21:02:30 +00:00
Vasileios Kalintiris
5a77e65d39 [mips][FastISel] Implement FastMaterializeAlloca in Mips fast-isel.
Summary: Implement the method FastMaterializeAlloca in Mips fast-isel

Based on a patch by Reed Kotler.

Test Plan:
Passes test-suite at O0/O2 for mips32 r1/r2
fastalloca.ll

Reviewers: dsanders, rkotler

Subscribers: rfuhler, llvm-commits

Differential Revision: http://reviews.llvm.org/D6742

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235213 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-17 17:29:58 +00:00
Sanjay Patel
c7b16819e8 [X86, AVX] add an exedepfix entry for vmovq == vmovlps == vmovlpd
This is the AVX extension of r235014:
http://llvm.org/viewvc/llvm-project?view=revision&revision=235014

Review:
http://reviews.llvm.org/D8691



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235210 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-17 17:02:37 +00:00
Vasileios Kalintiris
44bde654f5 [mips][FastISel] Implement shift ops for Mips fast-isel.
Summary:
Add shift operators implementation to fast-isel for Mips.  These are shift ops
for non legal forms, i.e. i8 and i16.

Based on a patch by Reed Kotler.

Test Plan:

Reviewers: dsanders

Subscribers: echristo, rfuhler, llvm-commits

Differential Revision: http://reviews.llvm.org/D6726

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235194 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-17 14:29:21 +00:00
Rafael Espindola
db244041cd Move AliasedSymbol to MachObjectWriter.
It was only used by MachO.
Part of pr19627.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235185 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-17 12:28:43 +00:00
Vasileios Kalintiris
187afcd548 [mips] Teach the delay slot filler to remove needless KILL instructions.
Summary:
Previously, the presence of KILL instructions would block valid candidates
from filling a specific delay slot. With the elimination of the KILL
instructions, in the appropriate range, we are able to fill more slots and
keep the information from future def/use analysis consistent.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: hfinkel, llvm-commits

Differential Revision: http://reviews.llvm.org/D7724

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235183 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-17 12:01:02 +00:00
Benjamin Kramer
1e5490b167 [mc] Clean up emission of byte sequences
No functional change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235178 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-17 11:12:43 +00:00
Daniel Sanders
01218af77f [mips] Move ABI-dependent register selections to MipsABIInfo. NFC.
Summary:
For example, a common idiom was 'isN64 ? Mips::SP_64 : Mips::SP'. This has
been moved to MipsABIInfo and replaced with 'ABI.GetStackPtr()'.

There are others that should also be moved. This patch sticks to the ones that
are obviously non-functional. The others have minor mistakes that need fixing
at the same time, mostly involving checks for 64-bit GPR's instead of checks
for 64-bit pointers.

Reviewers: tomatabacu

Reviewed By: tomatabacu

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8972

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235173 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-17 09:50:21 +00:00
Ahmed Bougacha
7e3c3ae7c1 [AArch64] Don't assert on f16 in DUP PerfectShuffle generator.
Found by code inspection, but breaking i16 at least breaks other tests.
They aren't checking this in particular though, so also add some
explicit tests for the already working types.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235148 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-16 23:57:07 +00:00
Pete Cooper
8dd904ce60 Disable AArch64 fast-isel on big-endian call vector returns.
A big-endian vector return needs a byte-swap which we aren't doing right now.

For now just bail on these cases to get correctness back.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235133 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-16 21:19:36 +00:00
Vladimir Sukharev
cf9593b050 [AArch64] Add v8.1a "Virtualization Host Extensions"
Reviewers: t.p.northover

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8500

Patch by: Tom Coxon


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235107 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-16 15:38:58 +00:00
Vladimir Sukharev
39c4ba63f2 [AArch64] Add v8.1a "Limited Ordering Regions" extension
Reviewers: 	t.p.northover

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8499

Patch by: Tom Coxon


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235105 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-16 15:30:43 +00:00
Vladimir Sukharev
798efb5b3a [AArch64] Add v8.1a "Privileged Access Never" extension
Reviewers: jmolloy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8498


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235104 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-16 15:20:51 +00:00
Vladimir Sukharev
3445ca85e0 [AArch64] Handle Cyclone-specific register in common way
Reviewers: jmolloy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8584

Patch by: Tom Coxon


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235102 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-16 15:01:20 +00:00
Vladimir Sukharev
c3f3fb7ac2 [AArch64] Follow-up to: Refactor AArch64NamedImmMapper to become dependent on subtarget features
Fixed compilation with clang on some buildbots with "-Werror -Wmissing-field-initializers"

Related to: http://reviews.llvm.org/rL235089


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235099 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-16 14:36:13 +00:00
Toma Tabacu
9f8549a4a0 [mips] [IAS] Preserve microMIPS label marking for objects when assigning.
Summary: Previously, this was only happening for functions, but because of .insn, objects can also be marked now.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8007

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235095 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-16 13:37:32 +00:00
Benjamin Kramer
e6da045ce1 [Mips] Use unique_ptr to manage ownership.
Required some tweaking of ValueMap to accommodate a move-only value
type. No functional change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235091 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-16 12:43:33 +00:00
Benjamin Kramer
edc3aaefa6 Make it obvious that we're iterating over a range of pointers.
Found by -Wrange-loop-analysis.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235090 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-16 12:43:07 +00:00
Vladimir Sukharev
24c3ef3325 [AArch64] Refactor AArch64NamedImmMapper to become dependent on subtarget features.
In order to introduce v8.1a-specific entities, Mappers should be aware of SubtargetFeatures available.

This patch introduces refactoring, that will then allow to easily introduce:

- v8.1-specific "pan" PState for PStateMapper (PAN extension)

- v8.1-specific sysregs for SysRegMapper (LOR,VHE extensions)

Reviewers: jmolloy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8496

Patch by Tom Coxon


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235089 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-16 12:15:27 +00:00
James Molloy
5717e28019 [AArch64] Fix invalid use of references to BuildMI.
This was found in GCC PR65773 (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65773).

We shouldn't be taking a reference to the temporary that BuildMI returns, we must copy it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235088 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-16 11:37:40 +00:00
Vladimir Sukharev
5ade5fcee4 [ARM] Add v8.1a "Privileged Access Never" extension
Reviewers: jmolloy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8504


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235087 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-16 11:34:25 +00:00
Toma Tabacu
e7d84301cc [mips] [IAS] Add support for the .insn directive.
Summary:
This assembler directive marks the current label as an instruction label in microMIPS and MIPS16.

This initial implementation works only for microMIPS.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8006

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235084 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-16 09:53:47 +00:00
Duncan P. N. Exon Smith
9c1aa1c021 DebugInfo: Gut DIScope, DIEnumerator and DISubrange
The only class the still has API left is `DIDescriptor` itself.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235067 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-16 01:37:00 +00:00
Duncan P. N. Exon Smith
ed0e117ff3 DebugInfo: Gut DICompileUnit and DIFile
Continuing gutting `DIDescriptor` subclasses; this edition,
`DICompileUnit` and `DIFile`.  In the name of PR23080.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235055 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-15 23:19:27 +00:00
Charlie Turner
fdb3720f58 Fix BXJ is undefined in AArch32.
BXJ was incorrectly said to be unsupported in ARMv8-A. It is not
supported in the A64 instruction set, but it is supported in the T32
and A32 instruction sets, because it's listed as an instruction in the
ARM ARM section F7.1.28.

Using SP as an operand to BXJ changed from UNPREDICTABLE to
PREDICTABLE in v8-A. This patch reflects that update as well.

This was found by MCHammer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235024 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-15 17:28:23 +00:00
Sanjay Patel
e3e5fcab94 [X86] add an exedepfix entry for movq == movlps == movlpd
This is a 1-line patch (with a TODO for AVX because that will affect
even more regression tests) that lets us substitute the appropriate
64-bit store for the float/double/int domains.

It's not clear to me exactly what the difference is between the 0xD6 (MOVPQI2QImr) and 
0x7E (MOVSDto64mr) opcodes, but this is apparently the right choice.

Differential Revision: http://reviews.llvm.org/D8691



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235014 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-15 15:47:51 +00:00
Sanjay Patel
0332323ab6 [x86] Implement combineRepeatedFPDivisors
Set the transform bar at 2 divisions because the fastest current
x86 FP divider circuit is in SandyBridge / Haswell at 10 cycle
latency (best case) relative to a 5 cycle multiplier. 
So that's the worst case for this transform (no latency win), 
but multiplies are obviously pipelined while divisions are not,
so there's still a big throughput win which we would expect to
show up in typical FP code.

These are the sequences I'm comparing:

  divss   %xmm2, %xmm0
  mulss   %xmm1, %xmm0
  divss   %xmm2, %xmm0

Becomes:

  movss   LCPI0_0(%rip), %xmm3    ## xmm3 = mem[0],zero,zero,zero
  divss   %xmm2, %xmm3
  mulss   %xmm3, %xmm0
  mulss   %xmm1, %xmm0
  mulss   %xmm3, %xmm0

[Ignore for the moment that we don't optimize the chain of 3 multiplies
into 2 independent fmuls followed by 1 dependent fmul...this is the DAG
version of: https://llvm.org/bugs/show_bug.cgi?id=21768 ...if we fix that,
then the transform becomes even more profitable on all targets.]

Differential Revision: http://reviews.llvm.org/D8941



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235012 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-15 15:22:55 +00:00
Daniel Sanders
8232560496 [msp430] Only support the 'm' inline assembly memory constraint. NFC.
Summary:
MSP430 doesn't seem to have any additional constraints. Therefore remove
the target hook.

No functional change intended.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8208


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@235003 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-15 12:51:28 +00:00
Toma Tabacu
e3d79bbe9f [mips] [IAS] Refactor the function which checks for the availability of AT. NFC.
Summary:
Refactor MipsAsmParser::getATReg to return an internal register number instead of a register index.
Also change all the int's to unsigned, seeing as the current AT register index is stored as an unsigned in MipsAssemblerOptions.



Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8478

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234996 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-15 10:48:56 +00:00
Alexei Starovoitov
49b5acf658 [bpf] fix build
fix build due to refactoring in DIL/MDL and raw_pwrite_stream

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234971 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-15 02:48:57 +00:00
Richard Trieu
64b905a58d Change range-based for-loops to be -Wrange-loop-analysis clean.
No functionality change.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234963 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-15 01:21:15 +00:00
Rafael Espindola
c98092e28d Use raw_pwrite_stream in the object writer/streamer.
The ELF object writer will take advantage of that in the next commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234950 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-14 22:14:34 +00:00
Ed Maste
ffc045ab80 Correct 'teh' and other typos / repeated words.
Patch by Eitan Adler.

Differential Revision:	http://reviews.llvm.org/D8514


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234939 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-14 20:52:58 +00:00
Alexander Kornienko
6f422b364d Refactor: Simplify boolean expressions in ARM target
Simplify boolean expressions using `true` and `false` with `clang-tidy`

http://reviews.llvm.org/D8524

Patch by Richard Thomson!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234901 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-14 15:32:58 +00:00
Bradley Smith
d87c77c0e8 [AArch64] Allow non-standard INS/DUP encodings
The ARMv8 ARMARM states that for these instructions in A64 state:

  "Unspecified bits in "imm5" are ignored but should be set to zero by an assembler.", (imm4 for INS).

Make the disassembler accept any encoding with these ignored bits set to 1.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234896 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-14 15:07:26 +00:00
Tom Stellard
71fcd2d4d8 R600/SI: Fix verifier error caused by SIAnnotateControlFlow
This pass will always try to insert llvm.SI.ifbreak intrinsics
in the same block that its conditional value is computed in.  This is
a problem when conditions for breaks or continue are computed outside
of the loop, because the llvm.SI.ifbreak intrinsic ends up being inserted
outside of the loop.

This patch fixes this problem by inserting the llvm.SI.ifbreak
intrinsics in the loop header when the condition is computed outside
the loop.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234891 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-14 14:36:45 +00:00
Petar Jovanovic
01b026b023 Re-enable target-specific relocation table sorting and use it for Mips
Some targets (ie. Mips) have additional rules for ordering the relocation
table entries. Allow them to override generic sortRelocs(), which sorts
entries by Offset.
Then override this function for Mips, to emit HI16 and GOT16 relocations
against the local symbol in pair with the corresponding LO16 relocation.

Patch by Vladimir Stefanovic.

Differential Revision: http://reviews.llvm.org/D7414


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234883 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-14 13:23:34 +00:00
Duncan P. N. Exon Smith
125e3d3959 DebugInfo: Gut DISubprogram and DILexicalBlock*
Gut the `DIDescriptor` wrappers around `MDLocalScope` subclasses.  Note
that `DILexicalBlock` wraps `MDLexicalBlockBase`, not `MDLexicalBlock`.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234850 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-14 03:40:37 +00:00
Duncan P. N. Exon Smith
355ec009e5 DebugInfo: Gut DIVariable and DIGlobalVariable
Gut all the non-pointer API from the variable wrappers, except an
implicit conversion from `DIGlobalVariable` to `DIDescriptor`.  Note
that if you're updating out-of-tree code, `DIVariable` wraps
`MDLocalVariable` (`MDVariable` is a common base class shared with
`MDGlobalVariable`).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234840 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-14 02:22:36 +00:00
Krzysztof Parzyszek
57ea789c3e Expand ADDO/SUBO on Hexagon
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234795 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-13 20:37:01 +00:00
Jan Vesely
a017ce21ba Revert revisions r234755, r234759, r234760
Revert "Remove default in fully-covered switch (to fix Clang -Werror -Wcovered-switch-default)"
Revert "R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO"
Revert "LegalizeDAG: Try to use Overflow operations when expanding ADD/SUB"

Using overflow operations fails CodeGen/Generic/2011-07-07-ScheduleDAGCrash.ll
on hexagon, nvptx, and r600. Revert while I investigate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234768 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-13 17:47:15 +00:00
Krzysztof Parzyszek
fcc330abfe Allow memory intrinsics to be tail calls
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234764 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-13 17:16:45 +00:00
Jan Vesely
4ce8b9c7fb R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO
v2: tighten the sub64 tests
v3: rename to CARRY/BORROW
v4: fixup test cmdline
    add known bits computation
    use sign extend instead of sub 0,x
    better add test
v5: remove redundant break
    move lowering to separate functions
    fix comments

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewers: arsenm

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234759 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-13 16:26:00 +00:00
Jan Vesely
b7239d3aa5 R600: Make FMIN/MAXNUM legal on all asics
v2: Add tests

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
reviewer: arsenm

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234716 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-12 23:45:05 +00:00
Jan Vesely
8332e14bee R600: remove manual BFE optimization
Fixed since r233079

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
reviewer: arsenm

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234715 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-12 23:45:01 +00:00
Hal Finkel
ad96655905 [PowerPC] Really iterate over all loops in PPCLoopDataPrefetch/PPCLoopPreIncPrep
When I fixed these a couple of days ago to iterate over all loops, not just
depth == 1 loops, I inadvertently made it such that we'd only look at the first
top-level loop. Make sure that we really look at all of them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234705 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-12 17:18:56 +00:00
Hal Finkel
1cea397876 [PowerPC] Disable part-word atomics on the P7
As it turns out, even though these are part of ISA 2.06, the P7 does not
support them (or, at least, not any P7s we're tested so far).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234686 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-11 13:40:36 +00:00
Nemanja Ivanovic
9ca031b6c6 Add direct moves to/from VSR and exploit them for FP/INT conversions
This patch corresponds to review:
http://reviews.llvm.org/D8928

It adds direct move instructions to/from VSX registers to GPR's. These are
exploited for FP <-> INT conversions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234682 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-11 10:40:42 +00:00
Alexander Kornienko
c16fc54851 Use 'override/final' instead of 'virtual' for overridden methods
The patch is generated using clang-tidy misc-use-override check.

This command was used:

  tools/clang/tools/extra/clang-tidy/tool/run-clang-tidy.py \
    -checks='-*,misc-use-override' -header-filter='llvm|clang' \
    -j=32 -fix -format

http://reviews.llvm.org/D8925



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234679 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-11 02:11:45 +00:00
Hal Finkel
ff49ef7838 [PowerPC] Fix PPCLoopPreIncPrep for depth > 1 loops
This pass had the same problem as the data-prefetching pass: it was only
checking for depth == 1 loops in practice. Fix that, add some debugging
statements, and make sure that, when we grab an AddRec, it is for the loop we
expect.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234670 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-11 00:33:08 +00:00
Ahmed Bougacha
d2069333ee [CodeGen] Split -enable-global-merge into ARM and AArch64 options.
Currently, there's a single flag, checked by the pass itself.
It can't force-enable the pass (and is on by default), because it
might not even have been created, as that's the targets decision.
Instead, have separate explicit flags, so that the decision is
consistently made in the target.

Keep the flag as a last-resort "force-disable GlobalMerge" for now,
for backwards compatibility.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234666 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-11 00:06:36 +00:00
Quentin Colombet
b26cf11a7b [AArch64] Strengthen the code for the prologue insertion.
The spilled registers are pristine and thus, correctly handled by
the register scavenger and so on, but the liveness information is
strictly speaking wrong at this point.
Fix that.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234664 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-10 23:14:34 +00:00
Hal Finkel
d4f643a5df [PowerPC] Prefetching should also consider depth > 1 loops
Iterating over loops from the LoopInfo instance only provides top-level loops.
We need to search the whole tree of loops to find the inner ones.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234603 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-10 15:05:02 +00:00
Toma Tabacu
3905bb619c [mips] [IAS] Improve comments in MipsAsmParser::expandLoadImm. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234595 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-10 13:28:16 +00:00
Chad Rosier
0937a348d0 [AArch64] Changes some SchedAlias to WriteRes for Cortex-A57.
Using SchedAliases is convenient and works well for latency and resource
lookup for instructions.  However, this creates an entry in
AArch64WriteLatencyTable with a WriteResourceID of 0, breaking any
SchedReadAdvance since the lookup will fail.

http://reviews.llvm.org/D8043
Patch by Dave Estes <cestes@codeaurora.org>!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234594 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-10 13:19:27 +00:00
Chad Rosier
694883f0f7 [AArch64] Adjusts Cortex-A57 machine model to handle zero shift.
http://reviews.llvm.org/D8043
Patch by Dave Estes <cestes@codeaurora.org>!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234593 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-10 13:19:21 +00:00
Benjamin Kramer
0973b7ddb8 Reduce dyn_cast<> to isa<> or cast<> where possible.
No functional change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234586 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-10 11:24:51 +00:00
Jingyue Wu
5733100450 Divergence analysis for GPU programs
Summary:
Some optimizations such as jump threading and loop unswitching can negatively
affect performance when applied to divergent branches. The divergence analysis
added in this patch conservatively estimates which branches in a GPU program
can diverge. This information can then help LLVM to run certain optimizations
selectively.

Test Plan: test/Analysis/DivergenceAnalysis/NVPTX/diverge.ll

Reviewers: resistor, hfinkel, eliben, meheff, jholewinski

Subscribers: broune, bjarke.roune, madhur13490, tstellarAMD, dberlin, echristo, jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D8576

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234567 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-10 05:03:50 +00:00
Hal Finkel
36f934a207 [PowerPC] Don't crash on PPC32 i64 fp_to_uint on modern cores
When we have an instruction for this (and, thus, don't generate a runtime
call), we need to custom type legalize this (in a trivial way, just as we do
for fp_to_sint).

Fixes PR23173.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234561 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-10 03:39:00 +00:00
Ahmed Bougacha
1810ca3110 [AArch64] Promote f16 operations to f32.
For the most common ones (such as fadd), we already did the promotion.
Do the same thing for all the others.

Currently, we'll just crash/assert on all these operations, as
there's no hardware or libcall support whatsoever.

f16 (half) is specified as an interchange - not arithmetic - format,
and is expected to be promoted to single-precision for arithmetic
operations.

While there, teach the legalizer about promoting some of the (mostly
floating-point) operations that we never needed before.

Differential Revision: http://reviews.llvm.org/D8648
See related discussion on the thread for: http://reviews.llvm.org/D8755


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234550 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-10 00:08:48 +00:00
Nemanja Ivanovic
c58b8f0b65 Add LLVM support for remaining integer divide and permute instructions from ISA 2.06
This is the patch corresponding to review:
http://reviews.llvm.org/D8406

It adds some missing instructions from ISA 2.06 to the PPC back end.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234546 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-09 23:54:37 +00:00
Rafael Espindola
427c073035 Simplify use of formatted_raw_ostream.
formatted_raw_ostream is a wrapper over another stream to add column and line
number tracking.

It is used only for asm printing.

This patch moves the its creation down to where we know we are printing
assembly. This has the following advantages:

* Simpler lifetime management: std::unique_ptr
* We don't compute column and line number of object files :-)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234535 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-09 21:06:08 +00:00
Juergen Ributzka
117bf240ef [AArch64][FastISel] Fix integer extend optimization.
The integer extend optimization tries to fold the extend into the load
instruction. This requires us to identify if the extend has already been
emitted or not and act accordingly on it.

The check that was originally performed for this was not sufficient. Besides
checking the ValueMap for a mapped register we also need to check if the
virtual register has already an associated machine instruction that defines it.

This fixes rdar://problem/20470788.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234529 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-09 20:00:46 +00:00
Eric Christopher
a9fa357be4 Remove duplicated code and consolidate initializers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234525 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-09 19:20:37 +00:00
Rafael Espindola
7e0993d377 clang-format bits of code to make a followup patch easy to read.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234519 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-09 18:32:58 +00:00
Rafael Espindola
440f0b28a8 Use a raw_svector_ostream instead of a raw_string_ostream.
It saves a bit of copying.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234507 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-09 17:16:25 +00:00
Rafael Espindola
e4053cd377 Don't repeat name in comment. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234506 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-09 17:10:57 +00:00
Rafael Espindola
a93117c50d This reverts commit r234460 and r234461.
Revert "Add classof implementations to the raw_ostream classes."
Revert "Use the cast machinery to remove dummy uses of formatted_raw_ostream."

The underlying issue can be fixed without classof.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234495 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-09 15:54:59 +00:00
Javed Absar
28c2fda9df [ARM] support for Cortex-R4/R4F
Currently, llvm (backend) doesn't know cortex-r4, even though it is the
default target for armv7r. Using "--target=armv7r-arm-none-eabi" provokes
'cortex-r4' is not a recognized processor for this target' by llvm.
This patch adds support for cortex-r4 and, very closely related, r4f.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234486 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-09 14:07:28 +00:00
Toma Tabacu
52c5645668 [mips] Refactor saved-registers bitmask creation in MipsAsmPrinter::printSavedRegsBitmask. NFC.
Summary:
Make the code more readable by fusing the for-loops together and explicitly checking for each register class.

Also, this version is more straightforward because it doesn't assume that FPU registers always come before CPU registers in the CalleeSavedInfo vector.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8033

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234475 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-09 10:54:16 +00:00
Kristof Beyls
2a6ad5bfb2 [AArch64] Add support for dynamic stack alignment
Differential Revision: http://reviews.llvm.org/D8876



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234471 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-09 08:49:47 +00:00
Lang Hames
64008ef318 [AArch64] Remove redundant -march option. Also fix a think-o from r234462.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234467 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-09 05:34:57 +00:00
Lang Hames
174f04eefb [AArch64] Teach AArch64TargetLowering::getOptimalMemOpType to consider alignment
restrictions when choosing a type for small-memcpy inlining in
SelectionDAGBuilder.

This ensures that the loads and stores output for the memcpy won't be further
expanded during legalization, which would cause the total number of instructions
for the memcpy to exceed (often significantly) the inlining thresholds.

<rdar://problem/17829180>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234462 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-09 03:40:33 +00:00
Rafael Espindola
23295f613b Use the cast machinery to remove dummy uses of formatted_raw_ostream.
If we know we are producing an object, we don't need to wrap the stream
in a formatted_raw_ostream anymore.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234461 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-09 02:28:12 +00:00
Scott Douglass
7e2bc24e05 [ARM] make vminnm/vmaxnm work with ?le, ?ge and no-nans-fp-math
Because -menable-no-nans causes fcmp conditions to be rewritten
without 'o' or 'u' the recognition code in needs to cope. Also
extended it to handle 'le' and 'ge.

Differential Revision: http://reviews.llvm.org/D8725


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234421 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 17:18:28 +00:00
Toma Tabacu
739ca842aa [mips] [IAS] Do not generate redundant move when expanding lw/sw with symbol.
Summary:
Even though there is no 2nd register operand in the "lw/sw $8, symbol" case, we still try to find one, 
and we end up with $0, which makes us generate an unnecessary "addu $8, $8, $0" (a.k.a. "move $8, $8").

We can avoid this by checking if the 2nd register operand is different from $0, before generating the addu.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8055

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234406 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 13:52:41 +00:00
Toma Tabacu
f716ca43ca [mips] [IAS] Add support for the BNEZL and BEQZL pseudo-instructions.
Summary:
They are of the form "bnezl/beqzl $rs, offset" and expand to "bnel/beql $rs, $zero, offset".

These instructions are used in Linux inline assembly.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8540

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234401 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 12:15:05 +00:00
Sergey Dmitrouk
a7512d1d4a [ARM][Debug Info] Restore emitting of .cfi_def_cfa_offset for functions without stack frame
Summary: Looks like new code from [[ http://reviews.llvm.org/rL222057 | rL222057 ]] doesn't account for early `return` in `ARMFrameLowering::emitPrologue`, which leads to loosing `.cfi_def_cfa_offset` directive for functions without stack frame.

Reviewers: echristo, rengolin, asl, t.p.northover

Reviewed By: t.p.northover

Subscribers: llvm-commits, rengolin, aemerson

Differential Revision: http://reviews.llvm.org/D8606

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234399 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 10:10:12 +00:00
Toma Tabacu
39fedc9aa2 [mips] [IAS] Remove AssemblerPredicate's from RelocPIC and RelocStatic.
Summary:
These AssemblerPredicate's are unnecessary and actually make some instructions unusable when assembling pre-MIPS32 ISAs.
For example, this was causing the IAS to reject the 'j' instruction for MIPS I-V.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8300

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234398 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 10:06:45 +00:00
Alexei Starovoitov
34354ecc30 [bpf] support BPF backend as shared library
dependencies were not set correctly for shared library build.
static was ok

Patch by Brenden Blanco.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234386 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 03:46:51 +00:00
Tom Stellard
65bae699c4 R600/SI: Add some missing overrides
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234384 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 02:07:05 +00:00
Tom Stellard
a787066317 R600/SI: Initial support for assembler and inline assembly
This is currently considered experimental, but most of the more
commonly used instructions should work.

So far only SI has been extensively tested, CI and VI probably work too,
but may be buggy.  The current set of tests cases do not give complete
coverage, but I think it is sufficient for an experimental assembler.

See the documentation in R600Usage for more information.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234381 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 01:09:26 +00:00
Tom Stellard
c0578c36bb R600/SI: Add missing SOPK instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234380 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 01:09:22 +00:00
Tom Stellard
434e097df8 R600/SI: Don't print offset0/offset1 DS operands when they are 0
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234379 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 01:09:19 +00:00
Tim Northover
112102c7fe AArch64: disallow "fmov sD, #-0.0" during assembly.
We weren't checking the sign of the floating point immediate before translating
it to "fmov sD, wzr". Similarly for D-regs.

Technically "movi vD.2s, #0x80, lsl #24" would work most of the time, but it's
not a blessed alias (and I don't think it should be since people expect writing
sD to zero out the high lanes, and there's no dD equivalent). So an error it is.

rdar://20455398

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234372 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-07 22:49:47 +00:00
Ahmed Bougacha
236bac7a87 [ARM] Mark a bunch of .td Operands with type _MEMORY.
This shouldn't affect anything in-tree, as the OperandType users are
mostly smart disassemblers and such; more information is helpful there.
However, on the flip side, that + the fact that this is just hinting at
the meaning of operands makes this not really test-worthy or testable.

Differential Revision: http://reviews.llvm.org/D8620


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234350 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-07 20:31:16 +00:00
Alexei Starovoitov
68e6b651a1 [bpf] fix build
fix the build and remove unused variable warnings in Release mode.

Patch by Brenden Blanco.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234349 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-07 20:25:34 +00:00
Matthias Braun
0d8314f6f3 AArch64: Don't lower ISD::SELECT to ISD::SELECT_CC
Instead of lowering SELECT to SELECT_CC which is further lowered later
immediately call the SELECT_CC lowering code. This is preferable
because:
- Avoids an unnecessary roundtrip through the legalization queues with
  an intermediate node.
- More importantly: Lowered operations get visited last leading to SELECT_CC
  getting visited with legalized operands and unlegalized ones for preexisting
  SELECT_CC nodes. This does not hurt the current code (hence no testcase) but
  is required for another patch I am working on.

Differential Revision: http://reviews.llvm.org/D8187

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234334 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-07 17:33:05 +00:00
Toma Tabacu
db3b3a0b9f [mips] [IAS] Allow .set assignments for already defined symbols.
Summary:
This is not possible when using the IAS for MIPS, but it is possible when using the IAS for other architectures and when using GAS for MIPS.


Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8578

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234316 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-07 13:59:39 +00:00
Rafael Espindola
838c24a7c8 Refactor a lot of duplicated code for stub output.
This also moves it earlier so that it they are produced before we print
an end symbol for the data section.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234315 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-07 13:42:44 +00:00
Aaron Ballman
f9535e2b1e Silencing several "enumeral and non-enumeral type in conditional expression" warnings; NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234314 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-07 13:28:37 +00:00
Duncan P. N. Exon Smith
0477045c32 CodeGen: Stop using DIDescriptor::is*() and auto-casting
Same as r234255, but for lib/CodeGen and lib/Target.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234258 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-06 23:27:40 +00:00
Tim Northover
8af3f965e0 ARM: do not relax Thumb1 -> Thumb2 if only Thumb1 is available.
After recognising that a certain narrow instruction might need a relocation to
be represented, we used to unconditionally relax it to a Thumb2 instruction to
permit this. Unfortunately, some CPUs (e.g. v6m) don't even have most Thumb2
instructions, so we end up emitting a completely invalid instruction.

Theoretically, ELF does have relocations for these situations; but they are
fairly unusable with such short ranges and the ABI document even says they're
documented "for completeness". So an error is probably better there too.

rdar://20391953

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234195 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-06 18:44:42 +00:00
Simon Pilgrim
2ec7242600 [X86][SSE] Use (V)PINSRB for direct byte insertion in 16i8 buildvector on SSE4.1 targets
This patch allows SSE4.1 targets to use (V)PINSRB to create 16i8 vectors by inserting i8 scalars directly into a XMM register instead of merging pairs of i8 scalars into a i16 and using the SSE2 PINSRW instruction.

This allows folding of byte loads and reduces scalar register usage as well.

Differential Revision: http://reviews.llvm.org/D8839

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234193 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-06 18:39:00 +00:00
Rafael Espindola
e17e7a2400 Remove unnecessary uses of AliasedSymbol.
As pr19627 points out, every use of AliasedSymbol is likely a bug.

The main use was to avoid the oddity of a variable showing up as undefined. That
was fixed in r233995, which made these calls nops.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234169 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-06 16:10:05 +00:00
Rafael Espindola
83be4429b2 Store the sh_link of ARM_EXIDX directly in MCSectionELF.
This avoids some pretty horrible and broken name based section handling.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234142 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-06 04:25:18 +00:00
Rafael Espindola
903f4a2051 Implement unique sections with an unique ID.
This allows the compiler/assembly programmer to switch back to a
section. This in turn fixes the bootstrap failure on powerpc (tested
on gcc110) without changing the ppc codegen at all.

I will try to cleanup the various getELFSection overloads in a  followup patch.
Just using a default argument now would lead to ambiguities.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234099 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-04 18:02:01 +00:00
Craig Topper
4d1e15c54f [X86] Apply AddedComplexity consistently for similar patterns. This keeps them together in the DAGISel tables and reduces table size slightly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234086 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-04 04:22:12 +00:00
Craig Topper
4ed0907298 [X86] Add a comment about the change in r234075.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234079 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-04 02:31:43 +00:00
Craig Topper
b1ff87ec86 [X86] Don't use GR64 register 'and with immediate' instructions if the immediate is zero in the upper 33-bits or upper 57-bits. Use GR32 instructions instead.
Previously the patterns didn't have high enough priority and we would only use the GR32 form if the only the upper 32 or 56 bits were zero.

Fixes PR23100.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234075 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-04 02:08:20 +00:00
David Majnemer
f89ce9a09d [WinEH] Sink UnwindHelp completely out of IR
We don't need to represent UnwindHelp in IR.  Instead, we can use the
knowledge that we are emitting the parent function to decide if we
should create the UnwindHelp stack object.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234061 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 22:32:26 +00:00
David Blaikie
4a86b381a3 [opaque pointer type] More GEP IRBuilder API migrations...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234058 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 21:33:42 +00:00
David Blaikie
cf57d81b6e [opaque pointer type] More GEP API migrations in IRBuilder uses
The plan here is to push the API changes out from the common components
(like Constant::getGetElementPtr and IRBuilder::CreateGEP related
functions) and just update callers to either pass the type if it's
obvious, or pass null.

Do this with LoadInst as well and anything else that comes up, then to
start porting specific uses to not pass null anymore - this may require
some refactoring in each case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234042 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 19:41:44 +00:00
Duncan P. N. Exon Smith
f4f021c0a4 CodeGen: Assert that inlined-at locations agree
As a follow-up to r234021, assert that a debug info intrinsic variable's
`MDLocalVariable::getInlinedAt()` always matches the
`MDLocation::getInlinedAt()` of its `!dbg` attachment.

The goal here is to get rid of `MDLocalVariable::getInlinedAt()`
entirely (PR22778), but I'll let these assertions bake for a while
first.

If you have an out-of-tree backend that just broke, you're probably
attaching the wrong `DebugLoc` to a `DBG_VALUE` instruction.  The one
you want is the location that was attached to the corresponding
`@llvm.dbg.declare` or `@llvm.dbg.value` call that you started with.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234038 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 19:20:26 +00:00
Simon Pilgrim
be149a8148 [X86] Added SSE4.2 CRC32 memory folding patterns + tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234013 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 14:24:40 +00:00
Bill Schmidt
1c4b6de5ed [PowerPC] Enable splat generation for BUILD_VECTOR with little endian
When enabling PPC64LE, I disabled some optimizations of BUILD_VECTOR
nodes for little endian because wrong results were produced.  I've
subsequently investigated and found this is due to a call to
BuildVectorSDNode::isConstantSplat that was always specifying
big-endian.  With this changed to correctly identify the target
endianness, the optimizations work as expected.

I found another case of a call to the same method with big-endian
hardcoded, in PPC::isAllNegativeZeroVector().  I discovered this was
an orphaned method with no callers, so I've just removed it.

The existing test/CodeGen/PowerPC/vec_constants.ll checks these
optimizations, so for testing I've just added a variant for little
endian.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234011 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 13:48:24 +00:00
Simon Pilgrim
e5ecd32488 [X86][3DNow] Added 3DNow! memory folding patterns + tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234008 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 11:50:30 +00:00
Peter Collingbourne
c39f5dd0e2 MC: For variable symbols, maintain MCSymbol::Section as a cache.
Fixes PR19582.

Previously, when an asm assignment (.set or =) was created, we would look up
the section immediately in MCSymbol::setVariableValue. This caused symbols
to receive the wrong section if the RHS of the assignment had not been seen
yet. This had a knock-on effect in the object file emitters, causing them
to emit extra symbols, or to give symbols the wrong visibility or the wrong
section. For example, in the following asm:

.data
.Llocal:

.text
leaq .Llocal1(%rip), %rdi
.Llocal1 = .Llocal2
.Llocal2 = .Llocal

the first assignment would give .Llocal1 a null section, which would never get
fixed up by the second assignment. This would cause the ELF object file emitter
to consider .Llocal1 to be an undefined symbol and give it external linkage,
even though .Llocal1 should not have been emitted at all in the object file.

Or in the following asm:

alias_to_local = Ltmp0
Ltmp0:

the Mach-O object file emitter would give the alias_to_local symbol a n_type
of N_SECT and a n_sect of 0.  This is invalid under the Mach-O specification,
which requires N_SECT symbols to receive a non-zero section number if the
symbol is defined in a section in the object file.

https://developer.apple.com/library/mac/documentation/DeveloperTools/Conceptual/MachORuntime/#//apple_ref/c/tag/nlist

After this change we do not look up the section when the assignment is created,
but instead look it up on demand and store it in Section, which is treated
as a cache if the symbol is a variable symbol.

This change also fixes a bug in MCExpr::FindAssociatedSection. Previously,
if we saw a subtraction, we would return the first referenced section, even in
cases where we should have been returning the absolute pseudo-section. Now we
always return the absolute pseudo-section for expressions that subtract two
section-derived expressions. This isn't always correct (e.g. if one of the
sections ends up being laid out at an absolute address), but it's probably
the best we can do without more context.

This allows us to remove code in two places where we appear to have been
working around this bug, in MachObjectWriter::markAbsoluteVariableSymbols
and in X86AsmPrinter::EmitStartOfAsmFile.

Re-applies r233595 (aka D8586), which was reverted in r233898.

Differential Revision: http://reviews.llvm.org/D8798

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233995 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 01:46:11 +00:00
Matthias Braun
67a2b82b14 ARM: Handle physreg targets in RegPair hints gracefully
Register coalescing can change the target of a RegPair hint to a
physreg, we should not crash on this. This also slightly improved the
way ARMBaseRegisterInfo::updateRegAllocHint() works.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233987 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 00:18:38 +00:00
Sanjay Patel
5b93ab6cde [AVX] Improve insertion of i8 or i16 into low element of 256-bit zero vector
Without this patch, we split the 256-bit vector into halves and produced something like:
	movzwl	(%rdi), %eax
	vmovd	%eax, %xmm0
	vxorps	%xmm1, %xmm1, %xmm1
	vblendps	$15, %ymm0, %ymm1, %ymm0 ## ymm0 = ymm0[0,1,2,3],ymm1[4,5,6,7]

Now, we eliminate the xor and blend because those zeros are free with the vmovd:
        movzwl  (%rdi), %eax
        vmovd   %eax, %xmm0

This should be the final fix needed to resolve PR22685:
https://llvm.org/bugs/show_bug.cgi?id=22685




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233941 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-02 20:21:52 +00:00
David Blaikie
19443c1bcb [opaque pointer type] API migration for GEP constant factories
Require the pointee type to be passed explicitly and assert that it is
correct. For now it's possible to pass nullptr here (and I've done so in
a few places in this patch) but eventually that will be disallowed once
all clients have been updated or removed. It'll be a long road to get
all the way there... but if you have the cahnce to update your callers
to pass the type explicitly without depending on a pointer's element
type, that would be a good thing to do soon and a necessary thing to do
eventually.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233938 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-02 18:55:32 +00:00
Quentin Colombet
fa8f2103a5 [AArch64] Add a comment to make it explicit why we increased the complexity.
Follow-up of r233653.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233936 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-02 18:54:23 +00:00
Sanjay Patel
8765e82c83 [X86, AVX] adjust tablegen patterns to generate better code for scalar insertion into zero vector (PR23073)
For code like this:

define <8 x i32> @load_v8i32() {
  ret <8 x i32> <i32 7, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>
}

We produce this AVX code:

_load_v8i32:                            ## @load_v8i32
  movl	$7, %eax
  vmovd	%eax, %xmm0
  vxorps	%ymm1, %ymm1, %ymm1
  vblendps	$1, %ymm0, %ymm1, %ymm0 ## ymm0 = ymm0[0],ymm1[1,2,3,4,5,6,7]
  retq

There are at least 2 bugs in play here:

    We're generating a blend when a move scalar does the same job using 2 less instruction bytes (see FIXMEs).
    We're not matching an existing pattern that would eliminate the xor and blend entirely. The zero bytes are free with vmovd.

The 2nd fix involves an adjustment of "AddedComplexity" [1] and mostly masks the 1st problem.

[1] AddedComplexity has close to no documentation in the source. 
The best we have is this comment: "roughly corresponds to the number of nodes that are covered". 
It appears that x86 has bastardized this definition by inflating its values for some other
undocumented reason. For example, we have a pattern with "AddedComplexity = 400" (!). 

I searched my way to this page:
https://groups.google.com/forum/#!topic/llvm-dev/5UX-Og9M0xQ

Differential Revision: http://reviews.llvm.org/D8794



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233931 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-02 17:56:17 +00:00
Vasileios Kalintiris
3fabd02ea5 [mips] Implement eliminateCallFramePseudoInstr() in MipsFrameLowering. NFC.
Summary:
Avoid duplicate code in Mips16FrameLowering and MipsSEFrameLowering by
providing an implementation of the eliminateCallFramePseudoInstr()
function from their base class.

Depends on D8640.

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8641

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233909 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-02 11:09:40 +00:00
Elena Demikhovsky
4eb165220f AVX-512: intrinsics for VPADD, VPMULDQ and VPSUB
by Asaf Badouh (asaf.badouh@intel.com)


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233906 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-02 10:51:40 +00:00
Vasileios Kalintiris
91a9b15130 [mips] Expose adjustStackPtr() from MipsInstrInfo. NFC.
Summary:
adjustStackPtr() is implemented from both MipsSEInstrInfo and
Mips16InstrInfo. It makes sense to expose this function from
MipsInstrInfo and avoid explicit casting in some cases.

Depends on D8638.

Reviewers: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8640

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233905 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-02 10:42:44 +00:00
Vasileios Kalintiris
cb64898a22 [mips] Make sure that we don't adjust the stack pointer by zero amount.
Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8638

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233904 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-02 10:14:54 +00:00
Peter Collingbourne
a8432640e8 Revert r233595, "MC: For variable symbols, maintain MCSymbol::Section as a cache."
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233898 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-02 07:02:51 +00:00
Benjamin Kramer
a066ed09db [X86] Don't accidentally select shll $1, %eax when shrinking an immediate.
addl has higher throughput and this was needlessly picking a suboptimal
encoding causing PR23098.

I wish there was a way of doing this without further duplicating tbl-
generated patterns, but so far I haven't found one.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233832 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-01 19:01:09 +00:00
Vladimir Sukharev
0751793310 [ARM] Rename v8.1a from "extension" to "architecture"
v8.1a is renamed to architecture, following current entity naming approach.

Excess generic cpu is removed. Intended use: "generic" cpu with "v8.1a" subtarget feature

Reviewers: jmolloy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8767


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233811 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-01 14:54:56 +00:00
Vladimir Sukharev
65f303fd0c [AArch64] Rename v8.1a from "extension" to "architecture"
v8.1a is renamed to architecture, accordingly to approaches in ARM backend.

Excess generic cpu is removed. Intended use: "generic" cpu with "v8.1a" subtarget feature

Reviewers: jmolloy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8766


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233810 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-01 14:49:29 +00:00
Ulrich Weigand
adf55a5a57 [SystemZ] Support transactional execution on zEC12
The zEC12 provides the transactional-execution facility.  This is exposed
to users via a set of builtin routines on other compilers.  This patch
adds LLVM support to enable those builtins.  In partciular, the patch:

- adds the transactional-execution and processor-assist facilities
- adds MC support for all instructions provided by those facilities
- adds LLVM intrinsics for those instructions and hooks them up for CodeGen
- adds CodeGen support to optimize CC return value checking

Since this is first use of target-specific intrinsics on the platform,
the patch creates the include/llvm/IR/IntrinsicsSystemZ.td file and
hooks it up in Intrinsics.td.  I've also changed Triple::getArchTypePrefix
to return "s390" instead of "systemz", since the naming convention for
GCC intrinsics uses "s390" on the platform, and it neemed more straight-
forward to use the same convention for LLVM IR intrinsics.

An associated clang patch makes the intrinsics (and command line switches)
available at the source-language level.

For reference, the transactional-execution instructions are documented
in the z/Architecture Principles of Operation for the zEC12:
http://publibfp.boulder.ibm.com/cgi-bin/bookmgr/download/DZ9ZR009.pdf
The associated builtins are documented in the GCC manual:
http://gcc.gnu.org/onlinedocs/gcc/S_002f390-System-z-Built-in-Functions.html


Index: llvm-head/lib/Target/SystemZ/SystemZOperators.td
===================================================================
--- llvm-head.orig/lib/Target/SystemZ/SystemZOperators.td
+++ llvm-head/lib/Target/SystemZ/SystemZOperators.td
@@ -79,6 +79,9 @@ def SDT_ZI32Intrinsic       : SDTypeProf
 def SDT_ZPrefetch           : SDTypeProfile<0, 2,
                                             [SDTCisVT<0, i32>,
                                              SDTCisPtrTy<1>]>;
+def SDT_ZTBegin             : SDTypeProfile<0, 2,
+                                            [SDTCisPtrTy<0>,
+                                             SDTCisVT<1, i32>]>;
 
 //===----------------------------------------------------------------------===//
 // Node definitions
@@ -180,6 +183,15 @@ def z_prefetch          : SDNode<"System
                                  [SDNPHasChain, SDNPMayLoad, SDNPMayStore,
                                   SDNPMemOperand]>;
 
+def z_tbegin            : SDNode<"SystemZISD::TBEGIN", SDT_ZTBegin,
+                                 [SDNPHasChain, SDNPOutGlue, SDNPMayStore,
+                                  SDNPSideEffect]>;
+def z_tbegin_nofloat    : SDNode<"SystemZISD::TBEGIN_NOFLOAT", SDT_ZTBegin,
+                                 [SDNPHasChain, SDNPOutGlue, SDNPMayStore,
+                                  SDNPSideEffect]>;
+def z_tend              : SDNode<"SystemZISD::TEND", SDTNone,
+                                 [SDNPHasChain, SDNPOutGlue, SDNPSideEffect]>;
+
 //===----------------------------------------------------------------------===//
 // Pattern fragments
 //===----------------------------------------------------------------------===//
Index: llvm-head/lib/Target/SystemZ/SystemZInstrFormats.td
===================================================================
--- llvm-head.orig/lib/Target/SystemZ/SystemZInstrFormats.td
+++ llvm-head/lib/Target/SystemZ/SystemZInstrFormats.td
@@ -473,6 +473,17 @@ class InstSS<bits<8> op, dag outs, dag i
   let Inst{15-0}  = BD2;
 }
 
+class InstS<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSystemZ<4, outs, ins, asmstr, pattern> {
+  field bits<32> Inst;
+  field bits<32> SoftFail = 0;
+
+  bits<16> BD2;
+
+  let Inst{31-16} = op;
+  let Inst{15-0}  = BD2;
+}
+
 //===----------------------------------------------------------------------===//
 // Instruction definitions with semantics
 //===----------------------------------------------------------------------===//
Index: llvm-head/lib/Target/SystemZ/SystemZInstrInfo.td
===================================================================
--- llvm-head.orig/lib/Target/SystemZ/SystemZInstrInfo.td
+++ llvm-head/lib/Target/SystemZ/SystemZInstrInfo.td
@@ -1362,6 +1362,60 @@ let Defs = [CC] in {
 }
 
 //===----------------------------------------------------------------------===//
+// Transactional execution
+//===----------------------------------------------------------------------===//
+
+let Predicates = [FeatureTransactionalExecution] in {
+  // Transaction Begin
+  let hasSideEffects = 1, mayStore = 1,
+      usesCustomInserter = 1, Defs = [CC] in {
+    def TBEGIN : InstSIL<0xE560,
+                         (outs), (ins bdaddr12only:$BD1, imm32zx16:$I2),
+                         "tbegin\t$BD1, $I2",
+                         [(z_tbegin bdaddr12only:$BD1, imm32zx16:$I2)]>;
+    def TBEGIN_nofloat : Pseudo<(outs), (ins bdaddr12only:$BD1, imm32zx16:$I2),
+                                [(z_tbegin_nofloat bdaddr12only:$BD1,
+                                                   imm32zx16:$I2)]>;
+    def TBEGINC : InstSIL<0xE561,
+                          (outs), (ins bdaddr12only:$BD1, imm32zx16:$I2),
+                          "tbeginc\t$BD1, $I2",
+                          [(int_s390_tbeginc bdaddr12only:$BD1,
+                                             imm32zx16:$I2)]>;
+  }
+
+  // Transaction End
+  let hasSideEffects = 1, Defs = [CC], BD2 = 0 in
+    def TEND : InstS<0xB2F8, (outs), (ins), "tend", [(z_tend)]>;
+
+  // Transaction Abort
+  let hasSideEffects = 1, isTerminator = 1, isBarrier = 1 in
+    def TABORT : InstS<0xB2FC, (outs), (ins bdaddr12only:$BD2),
+                       "tabort\t$BD2",
+                       [(int_s390_tabort bdaddr12only:$BD2)]>;
+
+  // Nontransactional Store
+  let hasSideEffects = 1 in
+    def NTSTG : StoreRXY<"ntstg", 0xE325, int_s390_ntstg, GR64, 8>;
+
+  // Extract Transaction Nesting Depth
+  let hasSideEffects = 1 in
+    def ETND : InherentRRE<"etnd", 0xB2EC, GR32, (int_s390_etnd)>;
+}
+
+//===----------------------------------------------------------------------===//
+// Processor assist
+//===----------------------------------------------------------------------===//
+
+let Predicates = [FeatureProcessorAssist] in {
+  let hasSideEffects = 1, R4 = 0 in
+    def PPA : InstRRF<0xB2E8, (outs), (ins GR64:$R1, GR64:$R2, imm32zx4:$R3),
+                      "ppa\t$R1, $R2, $R3", []>;
+  def : Pat<(int_s390_ppa_txassist GR32:$src),
+            (PPA (INSERT_SUBREG (i64 (IMPLICIT_DEF)), GR32:$src, subreg_l32),
+                 0, 1)>;
+}
+
+//===----------------------------------------------------------------------===//
 // Miscellaneous Instructions.
 //===----------------------------------------------------------------------===//
 
Index: llvm-head/lib/Target/SystemZ/SystemZProcessors.td
===================================================================
--- llvm-head.orig/lib/Target/SystemZ/SystemZProcessors.td
+++ llvm-head/lib/Target/SystemZ/SystemZProcessors.td
@@ -60,6 +60,16 @@ def FeatureMiscellaneousExtensions : Sys
   "Assume that the miscellaneous-extensions facility is installed"
 >;
 
+def FeatureTransactionalExecution : SystemZFeature<
+  "transactional-execution", "TransactionalExecution",
+  "Assume that the transactional-execution facility is installed"
+>;
+
+def FeatureProcessorAssist : SystemZFeature<
+  "processor-assist", "ProcessorAssist",
+  "Assume that the processor-assist facility is installed"
+>;
+
 def : Processor<"generic", NoItineraries, []>;
 def : Processor<"z10", NoItineraries, []>;
 def : Processor<"z196", NoItineraries,
@@ -70,4 +80,5 @@ def : Processor<"zEC12", NoItineraries,
                 [FeatureDistinctOps, FeatureLoadStoreOnCond, FeatureHighWord,
                  FeatureFPExtension, FeaturePopulationCount,
                  FeatureFastSerialization, FeatureInterlockedAccess1,
-                 FeatureMiscellaneousExtensions]>;
+                 FeatureMiscellaneousExtensions,
+                 FeatureTransactionalExecution, FeatureProcessorAssist]>;
Index: llvm-head/lib/Target/SystemZ/SystemZSubtarget.cpp
===================================================================
--- llvm-head.orig/lib/Target/SystemZ/SystemZSubtarget.cpp
+++ llvm-head/lib/Target/SystemZ/SystemZSubtarget.cpp
@@ -40,6 +40,7 @@ SystemZSubtarget::SystemZSubtarget(const
       HasLoadStoreOnCond(false), HasHighWord(false), HasFPExtension(false),
       HasPopulationCount(false), HasFastSerialization(false),
       HasInterlockedAccess1(false), HasMiscellaneousExtensions(false),
+      HasTransactionalExecution(false), HasProcessorAssist(false),
       TargetTriple(TT), InstrInfo(initializeSubtargetDependencies(CPU, FS)),
       TLInfo(TM, *this), TSInfo(*TM.getDataLayout()), FrameLowering() {}
 
Index: llvm-head/lib/Target/SystemZ/SystemZSubtarget.h
===================================================================
--- llvm-head.orig/lib/Target/SystemZ/SystemZSubtarget.h
+++ llvm-head/lib/Target/SystemZ/SystemZSubtarget.h
@@ -42,6 +42,8 @@ protected:
   bool HasFastSerialization;
   bool HasInterlockedAccess1;
   bool HasMiscellaneousExtensions;
+  bool HasTransactionalExecution;
+  bool HasProcessorAssist;
 
 private:
   Triple TargetTriple;
@@ -102,6 +104,12 @@ public:
     return HasMiscellaneousExtensions;
   }
 
+  // Return true if the target has the transactional-execution facility.
+  bool hasTransactionalExecution() const { return HasTransactionalExecution; }
+
+  // Return true if the target has the processor-assist facility.
+  bool hasProcessorAssist() const { return HasProcessorAssist; }
+
   // Return true if GV can be accessed using LARL for reloc model RM
   // and code model CM.
   bool isPC32DBLSymbol(const GlobalValue *GV, Reloc::Model RM,
Index: llvm-head/lib/Support/Triple.cpp
===================================================================
--- llvm-head.orig/lib/Support/Triple.cpp
+++ llvm-head/lib/Support/Triple.cpp
@@ -92,7 +92,7 @@ const char *Triple::getArchTypePrefix(Ar
   case sparcv9:
   case sparc:       return "sparc";
 
-  case systemz:     return "systemz";
+  case systemz:     return "s390";
 
   case x86:
   case x86_64:      return "x86";
Index: llvm-head/include/llvm/IR/Intrinsics.td
===================================================================
--- llvm-head.orig/include/llvm/IR/Intrinsics.td
+++ llvm-head/include/llvm/IR/Intrinsics.td
@@ -634,3 +634,4 @@ include "llvm/IR/IntrinsicsNVVM.td"
 include "llvm/IR/IntrinsicsMips.td"
 include "llvm/IR/IntrinsicsR600.td"
 include "llvm/IR/IntrinsicsBPF.td"
+include "llvm/IR/IntrinsicsSystemZ.td"
Index: llvm-head/include/llvm/IR/IntrinsicsSystemZ.td
===================================================================
--- /dev/null
+++ llvm-head/include/llvm/IR/IntrinsicsSystemZ.td
@@ -0,0 +1,46 @@
+//===- IntrinsicsSystemZ.td - Defines SystemZ intrinsics ---*- tablegen -*-===//
+//
+//                     The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+// This file defines all of the SystemZ-specific intrinsics.
+//
+//===----------------------------------------------------------------------===//
+
+//===----------------------------------------------------------------------===//
+//
+// Transactional-execution intrinsics
+//
+//===----------------------------------------------------------------------===//
+
+let TargetPrefix = "s390" in {
+  def int_s390_tbegin : Intrinsic<[llvm_i32_ty], [llvm_ptr_ty, llvm_i32_ty],
+                                  [IntrNoDuplicate]>;
+
+  def int_s390_tbegin_nofloat : Intrinsic<[llvm_i32_ty],
+                                          [llvm_ptr_ty, llvm_i32_ty],
+                                          [IntrNoDuplicate]>;
+
+  def int_s390_tbeginc : Intrinsic<[], [llvm_ptr_ty, llvm_i32_ty],
+                                   [IntrNoDuplicate]>;
+
+  def int_s390_tabort : Intrinsic<[], [llvm_i64_ty],
+                                  [IntrNoReturn, Throws]>;
+
+  def int_s390_tend : GCCBuiltin<"__builtin_tend">,
+                      Intrinsic<[llvm_i32_ty], []>;
+
+  def int_s390_etnd : GCCBuiltin<"__builtin_tx_nesting_depth">,
+                      Intrinsic<[llvm_i32_ty], [], [IntrNoMem]>;
+
+  def int_s390_ntstg : Intrinsic<[], [llvm_i64_ty, llvm_ptr64_ty],
+                                 [IntrReadWriteArgMem]>;
+
+  def int_s390_ppa_txassist : GCCBuiltin<"__builtin_tx_assist">,
+                              Intrinsic<[], [llvm_i32_ty]>;
+}
+
Index: llvm-head/lib/Target/SystemZ/SystemZ.h
===================================================================
--- llvm-head.orig/lib/Target/SystemZ/SystemZ.h
+++ llvm-head/lib/Target/SystemZ/SystemZ.h
@@ -68,6 +68,18 @@ const unsigned CCMASK_TM_MSB_0       = C
 const unsigned CCMASK_TM_MSB_1       = CCMASK_2 | CCMASK_3;
 const unsigned CCMASK_TM             = CCMASK_ANY;
 
+// Condition-code mask assignments for TRANSACTION_BEGIN.
+const unsigned CCMASK_TBEGIN_STARTED       = CCMASK_0;
+const unsigned CCMASK_TBEGIN_INDETERMINATE = CCMASK_1;
+const unsigned CCMASK_TBEGIN_TRANSIENT     = CCMASK_2;
+const unsigned CCMASK_TBEGIN_PERSISTENT    = CCMASK_3;
+const unsigned CCMASK_TBEGIN               = CCMASK_ANY;
+
+// Condition-code mask assignments for TRANSACTION_END.
+const unsigned CCMASK_TEND_TX   = CCMASK_0;
+const unsigned CCMASK_TEND_NOTX = CCMASK_2;
+const unsigned CCMASK_TEND      = CCMASK_TEND_TX | CCMASK_TEND_NOTX;
+
 // The position of the low CC bit in an IPM result.
 const unsigned IPM_CC = 28;
 
Index: llvm-head/lib/Target/SystemZ/SystemZISelLowering.h
===================================================================
--- llvm-head.orig/lib/Target/SystemZ/SystemZISelLowering.h
+++ llvm-head/lib/Target/SystemZ/SystemZISelLowering.h
@@ -146,6 +146,15 @@ enum {
   // Perform a serialization operation.  (BCR 15,0 or BCR 14,0.)
   SERIALIZE,
 
+  // Transaction begin.  The first operand is the chain, the second
+  // the TDB pointer, and the third the immediate control field.
+  // Returns chain and glue.
+  TBEGIN,
+  TBEGIN_NOFLOAT,
+
+  // Transaction end.  Just the chain operand.  Returns chain and glue.
+  TEND,
+
   // Wrappers around the inner loop of an 8- or 16-bit ATOMIC_SWAP or
   // ATOMIC_LOAD_<op>.
   //
@@ -318,6 +327,7 @@ private:
   SDValue lowerSTACKSAVE(SDValue Op, SelectionDAG &DAG) const;
   SDValue lowerSTACKRESTORE(SDValue Op, SelectionDAG &DAG) const;
   SDValue lowerPREFETCH(SDValue Op, SelectionDAG &DAG) const;
+  SDValue lowerINTRINSIC_W_CHAIN(SDValue Op, SelectionDAG &DAG) const;
 
   // If the last instruction before MBBI in MBB was some form of COMPARE,
   // try to replace it with a COMPARE AND BRANCH just before MBBI.
@@ -355,6 +365,10 @@ private:
   MachineBasicBlock *emitStringWrapper(MachineInstr *MI,
                                        MachineBasicBlock *BB,
                                        unsigned Opcode) const;
+  MachineBasicBlock *emitTransactionBegin(MachineInstr *MI,
+                                          MachineBasicBlock *MBB,
+                                          unsigned Opcode,
+                                          bool NoFloat) const;
 };
 } // end namespace llvm
 
Index: llvm-head/lib/Target/SystemZ/SystemZISelLowering.cpp
===================================================================
--- llvm-head.orig/lib/Target/SystemZ/SystemZISelLowering.cpp
+++ llvm-head/lib/Target/SystemZ/SystemZISelLowering.cpp
@@ -20,6 +20,7 @@
 #include "llvm/CodeGen/MachineInstrBuilder.h"
 #include "llvm/CodeGen/MachineRegisterInfo.h"
 #include "llvm/CodeGen/TargetLoweringObjectFileImpl.h"
+#include "llvm/IR/Intrinsics.h"
 #include <cctype>
 
 using namespace llvm;
@@ -304,6 +305,9 @@ SystemZTargetLowering::SystemZTargetLowe
   // Codes for which we want to perform some z-specific combinations.
   setTargetDAGCombine(ISD::SIGN_EXTEND);
 
+  // Handle intrinsics.
+  setOperationAction(ISD::INTRINSIC_W_CHAIN, MVT::Other, Custom);
+
   // We want to use MVC in preference to even a single load/store pair.
   MaxStoresPerMemcpy = 0;
   MaxStoresPerMemcpyOptSize = 0;
@@ -1031,6 +1035,53 @@ prepareVolatileOrAtomicLoad(SDValue Chai
   return DAG.getNode(SystemZISD::SERIALIZE, DL, MVT::Other, Chain);
 }
 
+// Return true if Op is an intrinsic node with chain that returns the CC value
+// as its only (other) argument.  Provide the associated SystemZISD opcode and
+// the mask of valid CC values if so.
+static bool isIntrinsicWithCCAndChain(SDValue Op, unsigned &Opcode,
+                                      unsigned &CCValid) {
+  unsigned Id = cast<ConstantSDNode>(Op.getOperand(1))->getZExtValue();
+  switch (Id) {
+  case Intrinsic::s390_tbegin:
+    Opcode = SystemZISD::TBEGIN;
+    CCValid = SystemZ::CCMASK_TBEGIN;
+    return true;
+
+  case Intrinsic::s390_tbegin_nofloat:
+    Opcode = SystemZISD::TBEGIN_NOFLOAT;
+    CCValid = SystemZ::CCMASK_TBEGIN;
+    return true;
+
+  case Intrinsic::s390_tend:
+    Opcode = SystemZISD::TEND;
+    CCValid = SystemZ::CCMASK_TEND;
+    return true;
+
+  default:
+    return false;
+  }
+}
+
+// Emit an intrinsic with chain with a glued value instead of its CC result.
+static SDValue emitIntrinsicWithChainAndGlue(SelectionDAG &DAG, SDValue Op,
+                                             unsigned Opcode) {
+  // Copy all operands except the intrinsic ID.
+  unsigned NumOps = Op.getNumOperands();
+  SmallVector<SDValue, 6> Ops;
+  Ops.reserve(NumOps - 1);
+  Ops.push_back(Op.getOperand(0));
+  for (unsigned I = 2; I < NumOps; ++I)
+    Ops.push_back(Op.getOperand(I));
+
+  assert(Op->getNumValues() == 2 && "Expected only CC result and chain");
+  SDVTList RawVTs = DAG.getVTList(MVT::Other, MVT::Glue);
+  SDValue Intr = DAG.getNode(Opcode, SDLoc(Op), RawVTs, Ops);
+  SDValue OldChain = SDValue(Op.getNode(), 1);
+  SDValue NewChain = SDValue(Intr.getNode(), 0);
+  DAG.ReplaceAllUsesOfValueWith(OldChain, NewChain);
+  return Intr;
+}
+
 // CC is a comparison that will be implemented using an integer or
 // floating-point comparison.  Return the condition code mask for
 // a branch on true.  In the integer case, CCMASK_CMP_UO is set for
@@ -1588,9 +1639,53 @@ static void adjustForTestUnderMask(Selec
   C.CCMask = NewCCMask;
 }
 
+// Return a Comparison that tests the condition-code result of intrinsic
+// node Call against constant integer CC using comparison code Cond.
+// Opcode is the opcode of the SystemZISD operation for the intrinsic
+// and CCValid is the set of possible condition-code results.
+static Comparison getIntrinsicCmp(SelectionDAG &DAG, unsigned Opcode,
+                                  SDValue Call, unsigned CCValid, uint64_t CC,
+                                  ISD::CondCode Cond) {
+  Comparison C(Call, SDValue());
+  C.Opcode = Opcode;
+  C.CCValid = CCValid;
+  if (Cond == ISD::SETEQ)
+    // bit 3 for CC==0, bit 0 for CC==3, always false for CC>3.
+    C.CCMask = CC < 4 ? 1 << (3 - CC) : 0;
+  else if (Cond == ISD::SETNE)
+    // ...and the inverse of that.
+    C.CCMask = CC < 4 ? ~(1 << (3 - CC)) : -1;
+  else if (Cond == ISD::SETLT || Cond == ISD::SETULT)
+    // bits above bit 3 for CC==0 (always false), bits above bit 0 for CC==3,
+    // always true for CC>3.
+    C.CCMask = CC < 4 ? -1 << (4 - CC) : -1;
+  else if (Cond == ISD::SETGE || Cond == ISD::SETUGE)
+    // ...and the inverse of that.
+    C.CCMask = CC < 4 ? ~(-1 << (4 - CC)) : 0;
+  else if (Cond == ISD::SETLE || Cond == ISD::SETULE)
+    // bit 3 and above for CC==0, bit 0 and above for CC==3 (always true),
+    // always true for CC>3.
+    C.CCMask = CC < 4 ? -1 << (3 - CC) : -1;
+  else if (Cond == ISD::SETGT || Cond == ISD::SETUGT)
+    // ...and the inverse of that.
+    C.CCMask = CC < 4 ? ~(-1 << (3 - CC)) : 0;
+  else
+    llvm_unreachable("Unexpected integer comparison type");
+  C.CCMask &= CCValid;
+  return C;
+}
+
 // Decide how to implement a comparison of type Cond between CmpOp0 with CmpOp1.
 static Comparison getCmp(SelectionDAG &DAG, SDValue CmpOp0, SDValue CmpOp1,
                          ISD::CondCode Cond) {
+  if (CmpOp1.getOpcode() == ISD::Constant) {
+    uint64_t Constant = cast<ConstantSDNode>(CmpOp1)->getZExtValue();
+    unsigned Opcode, CCValid;
+    if (CmpOp0.getOpcode() == ISD::INTRINSIC_W_CHAIN &&
+        CmpOp0.getResNo() == 0 && CmpOp0->hasNUsesOfValue(1, 0) &&
+        isIntrinsicWithCCAndChain(CmpOp0, Opcode, CCValid))
+      return getIntrinsicCmp(DAG, Opcode, CmpOp0, CCValid, Constant, Cond);
+  }
   Comparison C(CmpOp0, CmpOp1);
   C.CCMask = CCMaskForCondCode(Cond);
   if (C.Op0.getValueType().isFloatingPoint()) {
@@ -1632,6 +1727,17 @@ static Comparison getCmp(SelectionDAG &D
 
 // Emit the comparison instruction described by C.
 static SDValue emitCmp(SelectionDAG &DAG, SDLoc DL, Comparison &C) {
+  if (!C.Op1.getNode()) {
+    SDValue Op;
+    switch (C.Op0.getOpcode()) {
+    case ISD::INTRINSIC_W_CHAIN:
+      Op = emitIntrinsicWithChainAndGlue(DAG, C.Op0, C.Opcode);
+      break;
+    default:
+      llvm_unreachable("Invalid comparison operands");
+    }
+    return SDValue(Op.getNode(), Op->getNumValues() - 1);
+  }
   if (C.Opcode == SystemZISD::ICMP)
     return DAG.getNode(SystemZISD::ICMP, DL, MVT::Glue, C.Op0, C.Op1,
                        DAG.getConstant(C.ICmpType, MVT::i32));
@@ -1713,7 +1819,6 @@ SDValue SystemZTargetLowering::lowerSETC
 }
 
 SDValue SystemZTargetLowering::lowerBR_CC(SDValue Op, SelectionDAG &DAG) const {
-  SDValue Chain    = Op.getOperand(0);
   ISD::CondCode CC = cast<CondCodeSDNode>(Op.getOperand(1))->get();
   SDValue CmpOp0   = Op.getOperand(2);
   SDValue CmpOp1   = Op.getOperand(3);
@@ -1723,7 +1828,7 @@ SDValue SystemZTargetLowering::lowerBR_C
   Comparison C(getCmp(DAG, CmpOp0, CmpOp1, CC));
   SDValue Glue = emitCmp(DAG, DL, C);
   return DAG.getNode(SystemZISD::BR_CCMASK, DL, Op.getValueType(),
-                     Chain, DAG.getConstant(C.CCValid, MVT::i32),
+                     Op.getOperand(0), DAG.getConstant(C.CCValid, MVT::i32),
                      DAG.getConstant(C.CCMask, MVT::i32), Dest, Glue);
 }
 
@@ -2561,6 +2666,30 @@ SDValue SystemZTargetLowering::lowerPREF
                                  Node->getMemoryVT(), Node->getMemOperand());
 }
 
+// Return an i32 that contains the value of CC immediately after After,
+// whose final operand must be MVT::Glue.
+static SDValue getCCResult(SelectionDAG &DAG, SDNode *After) {
+  SDValue Glue = SDValue(After, After->getNumValues() - 1);
+  SDValue IPM = DAG.getNode(SystemZISD::IPM, SDLoc(After), MVT::i32, Glue);
+  return DAG.getNode(ISD::SRL, SDLoc(After), MVT::i32, IPM,
+                     DAG.getConstant(SystemZ::IPM_CC, MVT::i32));
+}
+
+SDValue
+SystemZTargetLowering::lowerINTRINSIC_W_CHAIN(SDValue Op,
+                                              SelectionDAG &DAG) const {
+  unsigned Opcode, CCValid;
+  if (isIntrinsicWithCCAndChain(Op, Opcode, CCValid)) {
+    assert(Op->getNumValues() == 2 && "Expected only CC result and chain");
+    SDValue Glued = emitIntrinsicWithChainAndGlue(DAG, Op, Opcode);
+    SDValue CC = getCCResult(DAG, Glued.getNode());
+    DAG.ReplaceAllUsesOfValueWith(SDValue(Op.getNode(), 0), CC);
+    return SDValue();
+  }
+
+  return SDValue();
+}
+
 SDValue SystemZTargetLowering::LowerOperation(SDValue Op,
                                               SelectionDAG &DAG) const {
   switch (Op.getOpcode()) {
@@ -2634,6 +2763,8 @@ SDValue SystemZTargetLowering::LowerOper
     return lowerSTACKRESTORE(Op, DAG);
   case ISD::PREFETCH:
     return lowerPREFETCH(Op, DAG);
+  case ISD::INTRINSIC_W_CHAIN:
+    return lowerINTRINSIC_W_CHAIN(Op, DAG);
   default:
     llvm_unreachable("Unexpected node to lower");
   }
@@ -2674,6 +2805,9 @@ const char *SystemZTargetLowering::getTa
     OPCODE(SEARCH_STRING);
     OPCODE(IPM);
     OPCODE(SERIALIZE);
+    OPCODE(TBEGIN);
+    OPCODE(TBEGIN_NOFLOAT);
+    OPCODE(TEND);
     OPCODE(ATOMIC_SWAPW);
     OPCODE(ATOMIC_LOADW_ADD);
     OPCODE(ATOMIC_LOADW_SUB);
@@ -3501,6 +3635,50 @@ SystemZTargetLowering::emitStringWrapper
   return DoneMBB;
 }
 
+// Update TBEGIN instruction with final opcode and register clobbers.
+MachineBasicBlock *
+SystemZTargetLowering::emitTransactionBegin(MachineInstr *MI,
+                                            MachineBasicBlock *MBB,
+                                            unsigned Opcode,
+                                            bool NoFloat) const {
+  MachineFunction &MF = *MBB->getParent();
+  const TargetFrameLowering *TFI = Subtarget.getFrameLowering();
+  const SystemZInstrInfo *TII = Subtarget.getInstrInfo();
+
+  // Update opcode.
+  MI->setDesc(TII->get(Opcode));
+
+  // We cannot handle a TBEGIN that clobbers the stack or frame pointer.
+  // Make sure to add the corresponding GRSM bits if they are missing.
+  uint64_t Control = MI->getOperand(2).getImm();
+  static const unsigned GPRControlBit[16] = {
+    0x8000, 0x8000, 0x4000, 0x4000, 0x2000, 0x2000, 0x1000, 0x1000,
+    0x0800, 0x0800, 0x0400, 0x0400, 0x0200, 0x0200, 0x0100, 0x0100
+  };
+  Control |= GPRControlBit[15];
+  if (TFI->hasFP(MF))
+    Control |= GPRControlBit[11];
+  MI->getOperand(2).setImm(Control);
+
+  // Add GPR clobbers.
+  for (int I = 0; I < 16; I++) {
+    if ((Control & GPRControlBit[I]) == 0) {
+      unsigned Reg = SystemZMC::GR64Regs[I];
+      MI->addOperand(MachineOperand::CreateReg(Reg, true, true));
+    }
+  }
+
+  // Add FPR clobbers.
+  if (!NoFloat && (Control & 4) != 0) {
+    for (int I = 0; I < 16; I++) {
+      unsigned Reg = SystemZMC::FP64Regs[I];
+      MI->addOperand(MachineOperand::CreateReg(Reg, true, true));
+    }
+  }
+
+  return MBB;
+}
+
 MachineBasicBlock *SystemZTargetLowering::
 EmitInstrWithCustomInserter(MachineInstr *MI, MachineBasicBlock *MBB) const {
   switch (MI->getOpcode()) {
@@ -3742,6 +3920,12 @@ EmitInstrWithCustomInserter(MachineInstr
     return emitStringWrapper(MI, MBB, SystemZ::MVST);
   case SystemZ::SRSTLoop:
     return emitStringWrapper(MI, MBB, SystemZ::SRST);
+  case SystemZ::TBEGIN:
+    return emitTransactionBegin(MI, MBB, SystemZ::TBEGIN, false);
+  case SystemZ::TBEGIN_nofloat:
+    return emitTransactionBegin(MI, MBB, SystemZ::TBEGIN, true);
+  case SystemZ::TBEGINC:
+    return emitTransactionBegin(MI, MBB, SystemZ::TBEGINC, true);
   default:
     llvm_unreachable("Unexpected instr type to insert");
   }
Index: llvm-head/test/CodeGen/SystemZ/htm-intrinsics.ll
===================================================================
--- /dev/null
+++ llvm-head/test/CodeGen/SystemZ/htm-intrinsics.ll
@@ -0,0 +1,352 @@
+; Test transactional-execution intrinsics.
+;
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=zEC12 | FileCheck %s
+
+declare i32 @llvm.s390.tbegin(i8 *, i32)
+declare i32 @llvm.s390.tbegin.nofloat(i8 *, i32)
+declare void @llvm.s390.tbeginc(i8 *, i32)
+declare i32 @llvm.s390.tend()
+declare void @llvm.s390.tabort(i64)
+declare void @llvm.s390.ntstg(i64, i64 *)
+declare i32 @llvm.s390.etnd()
+declare void @llvm.s390.ppa.txassist(i32)
+
+; TBEGIN.
+define void @test_tbegin() {
+; CHECK-LABEL: test_tbegin:
+; CHECK-NOT: stmg
+; CHECK: std %f8,
+; CHECK: std %f9,
+; CHECK: std %f10,
+; CHECK: std %f11,
+; CHECK: std %f12,
+; CHECK: std %f13,
+; CHECK: std %f14,
+; CHECK: std %f15,
+; CHECK: tbegin 0, 65292
+; CHECK: ld %f8,
+; CHECK: ld %f9,
+; CHECK: ld %f10,
+; CHECK: ld %f11,
+; CHECK: ld %f12,
+; CHECK: ld %f13,
+; CHECK: ld %f14,
+; CHECK: ld %f15,
+; CHECK: br %r14
+  call i32 @llvm.s390.tbegin(i8 *null, i32 65292)
+  ret void
+}
+
+; TBEGIN (nofloat).
+define void @test_tbegin_nofloat1() {
+; CHECK-LABEL: test_tbegin_nofloat1:
+; CHECK-NOT: stmg
+; CHECK-NOT: std
+; CHECK: tbegin 0, 65292
+; CHECK: br %r14
+  call i32 @llvm.s390.tbegin.nofloat(i8 *null, i32 65292)
+  ret void
+}
+
+; TBEGIN (nofloat) with integer CC return value.
+define i32 @test_tbegin_nofloat2() {
+; CHECK-LABEL: test_tbegin_nofloat2:
+; CHECK-NOT: stmg
+; CHECK-NOT: std
+; CHECK: tbegin 0, 65292
+; CHECK: ipm %r2
+; CHECK: srl %r2, 28
+; CHECK: br %r14
+  %res = call i32 @llvm.s390.tbegin.nofloat(i8 *null, i32 65292)
+  ret i32 %res
+}
+
+; TBEGIN (nofloat) with implicit CC check.
+define void @test_tbegin_nofloat3(i32 *%ptr) {
+; CHECK-LABEL: test_tbegin_nofloat3:
+; CHECK-NOT: stmg
+; CHECK-NOT: std
+; CHECK: tbegin 0, 65292
+; CHECK: jnh  {{\.L*}}
+; CHECK: mvhi 0(%r2), 0
+; CHECK: br %r14
+  %res = call i32 @llvm.s390.tbegin.nofloat(i8 *null, i32 65292)
+  %cmp = icmp eq i32 %res, 2
+  br i1 %cmp, label %if.then, label %if.end
+
+if.then:                                          ; preds = %entry
+  store i32 0, i32* %ptr, align 4
+  br label %if.end
+
+if.end:                                           ; preds = %if.then, %entry
+  ret void
+}
+
+; TBEGIN (nofloat) with dual CC use.
+define i32 @test_tbegin_nofloat4(i32 %pad, i32 *%ptr) {
+; CHECK-LABEL: test_tbegin_nofloat4:
+; CHECK-NOT: stmg
+; CHECK-NOT: std
+; CHECK: tbegin 0, 65292
+; CHECK: ipm %r2
+; CHECK: srl %r2, 28
+; CHECK: cijlh %r2, 2,  {{\.L*}}
+; CHECK: mvhi 0(%r3), 0
+; CHECK: br %r14
+  %res = call i32 @llvm.s390.tbegin.nofloat(i8 *null, i32 65292)
+  %cmp = icmp eq i32 %res, 2
+  br i1 %cmp, label %if.then, label %if.end
+
+if.then:                                          ; preds = %entry
+  store i32 0, i32* %ptr, align 4
+  br label %if.end
+
+if.end:                                           ; preds = %if.then, %entry
+  ret i32 %res
+}
+
+; TBEGIN (nofloat) with register.
+define void @test_tbegin_nofloat5(i8 *%ptr) {
+; CHECK-LABEL: test_tbegin_nofloat5:
+; CHECK-NOT: stmg
+; CHECK-NOT: std
+; CHECK: tbegin 0(%r2), 65292
+; CHECK: br %r14
+  call i32 @llvm.s390.tbegin.nofloat(i8 *%ptr, i32 65292)
+  ret void
+}
+
+; TBEGIN (nofloat) with GRSM 0x0f00.
+define void @test_tbegin_nofloat6() {
+; CHECK-LABEL: test_tbegin_nofloat6:
+; CHECK: stmg %r6, %r15,
+; CHECK-NOT: std
+; CHECK: tbegin 0, 3840
+; CHECK: br %r14
+  call i32 @llvm.s390.tbegin.nofloat(i8 *null, i32 3840)
+  ret void
+}
+
+; TBEGIN (nofloat) with GRSM 0xf100.
+define void @test_tbegin_nofloat7() {
+; CHECK-LABEL: test_tbegin_nofloat7:
+; CHECK: stmg %r8, %r15,
+; CHECK-NOT: std
+; CHECK: tbegin 0, 61696
+; CHECK: br %r14
+  call i32 @llvm.s390.tbegin.nofloat(i8 *null, i32 61696)
+  ret void
+}
+
+; TBEGIN (nofloat) with GRSM 0xfe00 -- stack pointer added automatically.
+define void @test_tbegin_nofloat8() {
+; CHECK-LABEL: test_tbegin_nofloat8:
+; CHECK-NOT: stmg
+; CHECK-NOT: std
+; CHECK: tbegin 0, 65280
+; CHECK: br %r14
+  call i32 @llvm.s390.tbegin.nofloat(i8 *null, i32 65024)
+  ret void
+}
+
+; TBEGIN (nofloat) with GRSM 0xfb00 -- no frame pointer needed.
+define void @test_tbegin_nofloat9() {
+; CHECK-LABEL: test_tbegin_nofloat9:
+; CHECK: stmg %r10, %r15,
+; CHECK-NOT: std
+; CHECK: tbegin 0, 64256
+; CHECK: br %r14
+  call i32 @llvm.s390.tbegin.nofloat(i8 *null, i32 64256)
+  ret void
+}
+
+; TBEGIN (nofloat) with GRSM 0xfb00 -- frame pointer added automatically.
+define void @test_tbegin_nofloat10(i64 %n) {
+; CHECK-LABEL: test_tbegin_nofloat10:
+; CHECK: stmg %r11, %r15,
+; CHECK-NOT: std
+; CHECK: tbegin 0, 65280
+; CHECK: br %r14
+  %buf = alloca i8, i64 %n
+  call i32 @llvm.s390.tbegin.nofloat(i8 *null, i32 64256)
+  ret void
+}
+
+; TBEGINC.
+define void @test_tbeginc() {
+; CHECK-LABEL: test_tbeginc:
+; CHECK-NOT: stmg
+; CHECK-NOT: std
+; CHECK: tbeginc 0, 65288
+; CHECK: br %r14
+  call void @llvm.s390.tbeginc(i8 *null, i32 65288)
+  ret void
+}
+
+; TEND with integer CC return value.
+define i32 @test_tend1() {
+; CHECK-LABEL: test_tend1:
+; CHECK: tend
+; CHECK: ipm %r2
+; CHECK: srl %r2, 28
+; CHECK: br %r14
+  %res = call i32 @llvm.s390.tend()
+  ret i32 %res
+}
+
+; TEND with implicit CC check.
+define void @test_tend3(i32 *%ptr) {
+; CHECK-LABEL: test_tend3:
+; CHECK: tend
+; CHECK: je  {{\.L*}}
+; CHECK: mvhi 0(%r2), 0
+; CHECK: br %r14
+  %res = call i32 @llvm.s390.tend()
+  %cmp = icmp eq i32 %res, 2
+  br i1 %cmp, label %if.then, label %if.end
+
+if.then:                                          ; preds = %entry
+  store i32 0, i32* %ptr, align 4
+  br label %if.end
+
+if.end:                                           ; preds = %if.then, %entry
+  ret void
+}
+
+; TEND with dual CC use.
+define i32 @test_tend2(i32 %pad, i32 *%ptr) {
+; CHECK-LABEL: test_tend2:
+; CHECK: tend
+; CHECK: ipm %r2
+; CHECK: srl %r2, 28
+; CHECK: cijlh %r2, 2,  {{\.L*}}
+; CHECK: mvhi 0(%r3), 0
+; CHECK: br %r14
+  %res = call i32 @llvm.s390.tend()
+  %cmp = icmp eq i32 %res, 2
+  br i1 %cmp, label %if.then, label %if.end
+
+if.then:                                          ; preds = %entry
+  store i32 0, i32* %ptr, align 4
+  br label %if.end
+
+if.end:                                           ; preds = %if.then, %entry
+  ret i32 %res
+}
+
+; TABORT with register only.
+define void @test_tabort1(i64 %val) {
+; CHECK-LABEL: test_tabort1:
+; CHECK: tabort 0(%r2)
+; CHECK: br %r14
+  call void @llvm.s390.tabort(i64 %val)
+  ret void
+}
+
+; TABORT with immediate only.
+define void @test_tabort2(i64 %val) {
+; CHECK-LABEL: test_tabort2:
+; CHECK: tabort 1234
+; CHECK: br %r14
+  call void @llvm.s390.tabort(i64 1234)
+  ret void
+}
+
+; TABORT with register + immediate.
+define void @test_tabort3(i64 %val) {
+; CHECK-LABEL: test_tabort3:
+; CHECK: tabort 1234(%r2)
+; CHECK: br %r14
+  %sum = add i64 %val, 1234
+  call void @llvm.s390.tabort(i64 %sum)
+  ret void
+}
+
+; TABORT with out-of-range immediate.
+define void @test_tabort4(i64 %val) {
+; CHECK-LABEL: test_tabort4:
+; CHECK: tabort 0({{%r[1-5]}})
+; CHECK: br %r14
+  call void @llvm.s390.tabort(i64 4096)
+  ret void
+}
+
+; NTSTG with base pointer only.
+define void @test_ntstg1(i64 *%ptr, i64 %val) {
+; CHECK-LABEL: test_ntstg1:
+; CHECK: ntstg %r3, 0(%r2)
+; CHECK: br %r14
+  call void @llvm.s390.ntstg(i64 %val, i64 *%ptr)
+  ret void
+}
+
+; NTSTG with base and index.
+; Check that VSTL doesn't allow an index.
+define void @test_ntstg2(i64 *%base, i64 %index, i64 %val) {
+; CHECK-LABEL: test_ntstg2:
+; CHECK: sllg [[REG:%r[1-5]]], %r3, 3
+; CHECK: ntstg %r4, 0([[REG]],%r2)
+; CHECK: br %r14
+  %ptr = getelementptr i64, i64 *%base, i64 %index
+  call void @llvm.s390.ntstg(i64 %val, i64 *%ptr)
+  ret void
+}
+
+; NTSTG with the highest in-range displacement.
+define void @test_ntstg3(i64 *%base, i64 %val) {
+; CHECK-LABEL: test_ntstg3:
+; CHECK: ntstg %r3, 524280(%r2)
+; CHECK: br %r14
+  %ptr = getelementptr i64, i64 *%base, i64 65535
+  call void @llvm.s390.ntstg(i64 %val, i64 *%ptr)
+  ret void
+}
+
+; NTSTG with an out-of-range positive displacement.
+define void @test_ntstg4(i64 *%base, i64 %val) {
+; CHECK-LABEL: test_ntstg4:
+; CHECK: ntstg %r3, 0({{%r[1-5]}})
+; CHECK: br %r14
+  %ptr = getelementptr i64, i64 *%base, i64 65536
+  call void @llvm.s390.ntstg(i64 %val, i64 *%ptr)
+  ret void
+}
+
+; NTSTG with the lowest in-range displacement.
+define void @test_ntstg5(i64 *%base, i64 %val) {
+; CHECK-LABEL: test_ntstg5:
+; CHECK: ntstg %r3, -524288(%r2)
+; CHECK: br %r14
+  %ptr = getelementptr i64, i64 *%base, i64 -65536
+  call void @llvm.s390.ntstg(i64 %val, i64 *%ptr)
+  ret void
+}
+
+; NTSTG with an out-of-range negative displacement.
+define void @test_ntstg6(i64 *%base, i64 %val) {
+; CHECK-LABEL: test_ntstg6:
+; CHECK: ntstg %r3, 0({{%r[1-5]}})
+; CHECK: br %r14
+  %ptr = getelementptr i64, i64 *%base, i64 -65537
+  call void @llvm.s390.ntstg(i64 %val, i64 *%ptr)
+  ret void
+}
+
+; ETND.
+define i32 @test_etnd() {
+; CHECK-LABEL: test_etnd:
+; CHECK: etnd %r2
+; CHECK: br %r14
+  %res = call i32 @llvm.s390.etnd()
+  ret i32 %res
+}
+
+; PPA (Transaction-Abort Assist)
+define void @test_ppa_txassist(i32 %val) {
+; CHECK-LABEL: test_ppa_txassist:
+; CHECK: ppa %r2, 0, 1
+; CHECK: br %r14
+  call void @llvm.s390.ppa.txassist(i32 %val)
+  ret void
+}
+
Index: llvm-head/test/MC/SystemZ/insn-bad-zEC12.s
===================================================================
--- llvm-head.orig/test/MC/SystemZ/insn-bad-zEC12.s
+++ llvm-head/test/MC/SystemZ/insn-bad-zEC12.s
@@ -3,6 +3,22 @@
 # RUN: FileCheck < %t %s
 
 #CHECK: error: invalid operand
+#CHECK: ntstg	%r0, -524289
+#CHECK: error: invalid operand
+#CHECK: ntstg	%r0, 524288
+
+	ntstg	%r0, -524289
+	ntstg	%r0, 524288
+
+#CHECK: error: invalid operand
+#CHECK: ppa	%r0, %r0, -1
+#CHECK: error: invalid operand
+#CHECK: ppa	%r0, %r0, 16
+
+	ppa	%r0, %r0, -1
+	ppa	%r0, %r0, 16
+
+#CHECK: error: invalid operand
 #CHECK: risbgn	%r0,%r0,0,0,-1
 #CHECK: error: invalid operand
 #CHECK: risbgn	%r0,%r0,0,0,64
@@ -22,3 +38,47 @@
 	risbgn	%r0,%r0,-1,0,0
 	risbgn	%r0,%r0,256,0,0
 
+#CHECK: error: invalid operand
+#CHECK: tabort	-1
+#CHECK: error: invalid operand
+#CHECK: tabort	4096
+#CHECK: error: invalid use of indexed addressing
+#CHECK: tabort	0(%r1,%r2)
+
+	tabort	-1
+	tabort	4096
+	tabort	0(%r1,%r2)
+
+#CHECK: error: invalid operand
+#CHECK: tbegin	-1, 0
+#CHECK: error: invalid operand
+#CHECK: tbegin	4096, 0
+#CHECK: error: invalid use of indexed addressing
+#CHECK: tbegin	0(%r1,%r2), 0
+#CHECK: error: invalid operand
+#CHECK: tbegin	0, -1
+#CHECK: error: invalid operand
+#CHECK: tbegin	0, 65536
+
+	tbegin	-1, 0
+	tbegin	4096, 0
+	tbegin	0(%r1,%r2), 0
+	tbegin	0, -1
+	tbegin	0, 65536
+
+#CHECK: error: invalid operand
+#CHECK: tbeginc	-1, 0
+#CHECK: error: invalid operand
+#CHECK: tbeginc	4096, 0
+#CHECK: error: invalid use of indexed addressing
+#CHECK: tbeginc	0(%r1,%r2), 0
+#CHECK: error: invalid operand
+#CHECK: tbeginc	0, -1
+#CHECK: error: invalid operand
+#CHECK: tbeginc	0, 65536
+
+	tbeginc	-1, 0
+	tbeginc	4096, 0
+	tbeginc	0(%r1,%r2), 0
+	tbeginc	0, -1
+	tbeginc	0, 65536
Index: llvm-head/test/MC/SystemZ/insn-good-zEC12.s
===================================================================
--- llvm-head.orig/test/MC/SystemZ/insn-good-zEC12.s
+++ llvm-head/test/MC/SystemZ/insn-good-zEC12.s
@@ -1,6 +1,48 @@
 # For zEC12 and above.
 # RUN: llvm-mc -triple s390x-linux-gnu -mcpu=zEC12 -show-encoding %s | FileCheck %s
 
+#CHECK: etnd	%r0                     # encoding: [0xb2,0xec,0x00,0x00]
+#CHECK: etnd	%r15                    # encoding: [0xb2,0xec,0x00,0xf0]
+#CHECK: etnd	%r7                     # encoding: [0xb2,0xec,0x00,0x70]
+
+	etnd	%r0
+	etnd	%r15
+	etnd	%r7
+
+#CHECK: ntstg	%r0, -524288            # encoding: [0xe3,0x00,0x00,0x00,0x80,0x25]
+#CHECK: ntstg	%r0, -1                 # encoding: [0xe3,0x00,0x0f,0xff,0xff,0x25]
+#CHECK: ntstg	%r0, 0                  # encoding: [0xe3,0x00,0x00,0x00,0x00,0x25]
+#CHECK: ntstg	%r0, 1                  # encoding: [0xe3,0x00,0x00,0x01,0x00,0x25]
+#CHECK: ntstg	%r0, 524287             # encoding: [0xe3,0x00,0x0f,0xff,0x7f,0x25]
+#CHECK: ntstg	%r0, 0(%r1)             # encoding: [0xe3,0x00,0x10,0x00,0x00,0x25]
+#CHECK: ntstg	%r0, 0(%r15)            # encoding: [0xe3,0x00,0xf0,0x00,0x00,0x25]
+#CHECK: ntstg	%r0, 524287(%r1,%r15)   # encoding: [0xe3,0x01,0xff,0xff,0x7f,0x25]
+#CHECK: ntstg	%r0, 524287(%r15,%r1)   # encoding: [0xe3,0x0f,0x1f,0xff,0x7f,0x25]
+#CHECK: ntstg	%r15, 0                 # encoding: [0xe3,0xf0,0x00,0x00,0x00,0x25]
+
+	ntstg	%r0, -524288
+	ntstg	%r0, -1
+	ntstg	%r0, 0
+	ntstg	%r0, 1
+	ntstg	%r0, 524287
+	ntstg	%r0, 0(%r1)
+	ntstg	%r0, 0(%r15)
+	ntstg	%r0, 524287(%r1,%r15)
+	ntstg	%r0, 524287(%r15,%r1)
+	ntstg	%r15, 0
+
+#CHECK: ppa	%r0, %r0, 0             # encoding: [0xb2,0xe8,0x00,0x00]
+#CHECK: ppa	%r0, %r0, 15            # encoding: [0xb2,0xe8,0xf0,0x00]
+#CHECK: ppa	%r0, %r15, 0            # encoding: [0xb2,0xe8,0x00,0x0f]
+#CHECK: ppa	%r4, %r6, 7             # encoding: [0xb2,0xe8,0x70,0x46]
+#CHECK: ppa	%r15, %r0, 0            # encoding: [0xb2,0xe8,0x00,0xf0]
+
+	ppa	%r0, %r0, 0
+	ppa	%r0, %r0, 15
+	ppa	%r0, %r15, 0
+	ppa	%r4, %r6, 7
+	ppa	%r15, %r0, 0
+
 #CHECK: risbgn	%r0, %r0, 0, 0, 0       # encoding: [0xec,0x00,0x00,0x00,0x00,0x59]
 #CHECK: risbgn	%r0, %r0, 0, 0, 63      # encoding: [0xec,0x00,0x00,0x00,0x3f,0x59]
 #CHECK: risbgn	%r0, %r0, 0, 255, 0     # encoding: [0xec,0x00,0x00,0xff,0x00,0x59]
@@ -17,3 +59,68 @@
 	risbgn	%r15,%r0,0,0,0
 	risbgn	%r4,%r5,6,7,8
 
+#CHECK: tabort	0                       # encoding: [0xb2,0xfc,0x00,0x00]
+#CHECK: tabort	0(%r1)                  # encoding: [0xb2,0xfc,0x10,0x00]
+#CHECK: tabort	0(%r15)                 # encoding: [0xb2,0xfc,0xf0,0x00]
+#CHECK: tabort	4095                    # encoding: [0xb2,0xfc,0x0f,0xff]
+#CHECK: tabort	4095(%r1)               # encoding: [0xb2,0xfc,0x1f,0xff]
+#CHECK: tabort	4095(%r15)              # encoding: [0xb2,0xfc,0xff,0xff]
+
+	tabort	0
+	tabort	0(%r1)
+	tabort	0(%r15)
+	tabort	4095
+	tabort	4095(%r1)
+	tabort	4095(%r15)
+
+#CHECK: tbegin	0, 0                    # encoding: [0xe5,0x60,0x00,0x00,0x00,0x00]
+#CHECK: tbegin	4095, 0                 # encoding: [0xe5,0x60,0x0f,0xff,0x00,0x00]
+#CHECK: tbegin	0, 0                    # encoding: [0xe5,0x60,0x00,0x00,0x00,0x00]
+#CHECK: tbegin	0, 1                    # encoding: [0xe5,0x60,0x00,0x00,0x00,0x01]
+#CHECK: tbegin	0, 32767                # encoding: [0xe5,0x60,0x00,0x00,0x7f,0xff]
+#CHECK: tbegin	0, 32768                # encoding: [0xe5,0x60,0x00,0x00,0x80,0x00]
+#CHECK: tbegin	0, 65535                # encoding: [0xe5,0x60,0x00,0x00,0xff,0xff]
+#CHECK: tbegin	0(%r1), 42              # encoding: [0xe5,0x60,0x10,0x00,0x00,0x2a]
+#CHECK: tbegin	0(%r15), 42             # encoding: [0xe5,0x60,0xf0,0x00,0x00,0x2a]
+#CHECK: tbegin	4095(%r1), 42           # encoding: [0xe5,0x60,0x1f,0xff,0x00,0x2a]
+#CHECK: tbegin	4095(%r15), 42          # encoding: [0xe5,0x60,0xff,0xff,0x00,0x2a]
+
+	tbegin	0, 0
+	tbegin	4095, 0
+	tbegin	0, 0
+	tbegin	0, 1
+	tbegin	0, 32767
+	tbegin	0, 32768
+	tbegin	0, 65535
+	tbegin	0(%r1), 42
+	tbegin	0(%r15), 42
+	tbegin	4095(%r1), 42
+	tbegin	4095(%r15), 42
+
+#CHECK: tbeginc	0, 0                    # encoding: [0xe5,0x61,0x00,0x00,0x00,0x00]
+#CHECK: tbeginc	4095, 0                 # encoding: [0xe5,0x61,0x0f,0xff,0x00,0x00]
+#CHECK: tbeginc	0, 0                    # encoding: [0xe5,0x61,0x00,0x00,0x00,0x00]
+#CHECK: tbeginc	0, 1                    # encoding: [0xe5,0x61,0x00,0x00,0x00,0x01]
+#CHECK: tbeginc	0, 32767                # encoding: [0xe5,0x61,0x00,0x00,0x7f,0xff]
+#CHECK: tbeginc	0, 32768                # encoding: [0xe5,0x61,0x00,0x00,0x80,0x00]
+#CHECK: tbeginc	0, 65535                # encoding: [0xe5,0x61,0x00,0x00,0xff,0xff]
+#CHECK: tbeginc	0(%r1), 42              # encoding: [0xe5,0x61,0x10,0x00,0x00,0x2a]
+#CHECK: tbeginc	0(%r15), 42             # encoding: [0xe5,0x61,0xf0,0x00,0x00,0x2a]
+#CHECK: tbeginc	4095(%r1), 42           # encoding: [0xe5,0x61,0x1f,0xff,0x00,0x2a]
+#CHECK: tbeginc	4095(%r15), 42          # encoding: [0xe5,0x61,0xff,0xff,0x00,0x2a]
+
+	tbeginc	0, 0
+	tbeginc	4095, 0
+	tbeginc	0, 0
+	tbeginc	0, 1
+	tbeginc	0, 32767
+	tbeginc	0, 32768
+	tbeginc	0, 65535
+	tbeginc	0(%r1), 42
+	tbeginc	0(%r15), 42
+	tbeginc	4095(%r1), 42
+	tbeginc	4095(%r15), 42
+
+#CHECK: tend                            # encoding: [0xb2,0xf8,0x00,0x00]
+
+	tend
Index: llvm-head/test/MC/SystemZ/insn-bad-z196.s
===================================================================
--- llvm-head.orig/test/MC/SystemZ/insn-bad-z196.s
+++ llvm-head/test/MC/SystemZ/insn-bad-z196.s
@@ -244,6 +244,11 @@
 	cxlgbr	%f0, 16, %r0, 0
 	cxlgbr	%f2, 0, %r0, 0
 
+#CHECK: error: {{(instruction requires: transactional-execution)?}}
+#CHECK: etnd	%r7
+
+	etnd	%r7
+
 #CHECK: error: invalid operand
 #CHECK: fidbra	%f0, 0, %f0, -1
 #CHECK: error: invalid operand
@@ -546,6 +551,16 @@
 	locr	%r0,%r0,-1
 	locr	%r0,%r0,16
 
+#CHECK: error: {{(instruction requires: transactional-execution)?}}
+#CHECK: ntstg	%r0, 524287(%r1,%r15)
+
+	ntstg	%r0, 524287(%r1,%r15)
+
+#CHECK: error: {{(instruction requires: processor-assist)?}}
+#CHECK: ppa	%r4, %r6, 7
+
+	ppa	%r4, %r6, 7
+
 #CHECK: error: {{(instruction requires: miscellaneous-extensions)?}}
 #CHECK: risbgn	%r1, %r2, 0, 0, 0
 
@@ -690,3 +705,24 @@
 	stocg	%r0,-524289,1
 	stocg	%r0,524288,1
 	stocg	%r0,0(%r1,%r2),1
+
+#CHECK: error: {{(instruction requires: transactional-execution)?}}
+#CHECK: tabort	4095(%r1)
+
+	tabort	4095(%r1)
+
+#CHECK: error: {{(instruction requires: transactional-execution)?}}
+#CHECK: tbegin	4095(%r1), 42
+
+	tbegin	4095(%r1), 42
+
+#CHECK: error: {{(instruction requires: transactional-execution)?}}
+#CHECK: tbeginc	4095(%r1), 42
+
+	tbeginc	4095(%r1), 42
+
+#CHECK: error: {{(instruction requires: transactional-execution)?}}
+#CHECK: tend
+
+	tend
+
Index: llvm-head/test/MC/Disassembler/SystemZ/insns.txt
===================================================================
--- llvm-head.orig/test/MC/Disassembler/SystemZ/insns.txt
+++ llvm-head/test/MC/Disassembler/SystemZ/insns.txt
@@ -2503,6 +2503,15 @@
 # CHECK: ear %r15, %a15
 0xb2 0x4f 0x00 0xff
 
+# CHECK: etnd %r0
+0xb2 0xec 0x00 0x00
+
+# CHECK: etnd %r15
+0xb2 0xec 0x00 0xf0
+
+# CHECK: etnd %r7
+0xb2 0xec 0x00 0x70
+
 # CHECK: fidbr %f0, 0, %f0
 0xb3 0x5f 0x00 0x00
 
@@ -6034,6 +6043,36 @@
 # CHECK: ny %r15, 0
 0xe3 0xf0 0x00 0x00 0x00 0x54
 
+# CHECK: ntstg %r0, -524288
+0xe3 0x00 0x00 0x00 0x80 0x25
+
+# CHECK: ntstg %r0, -1
+0xe3 0x00 0x0f 0xff 0xff 0x25
+
+# CHECK: ntstg %r0, 0
+0xe3 0x00 0x00 0x00 0x00 0x25
+
+# CHECK: ntstg %r0, 1
+0xe3 0x00 0x00 0x01 0x00 0x25
+
+# CHECK: ntstg %r0, 524287
+0xe3 0x00 0x0f 0xff 0x7f 0x25
+
+# CHECK: ntstg %r0, 0(%r1)
+0xe3 0x00 0x10 0x00 0x00 0x25
+
+# CHECK: ntstg %r0, 0(%r15)
+0xe3 0x00 0xf0 0x00 0x00 0x25
+
+# CHECK: ntstg %r0, 524287(%r1,%r15)
+0xe3 0x01 0xff 0xff 0x7f 0x25
+
+# CHECK: ntstg %r0, 524287(%r15,%r1)
+0xe3 0x0f 0x1f 0xff 0x7f 0x25
+
+# CHECK: ntstg %r15, 0
+0xe3 0xf0 0x00 0x00 0x00 0x25
+
 # CHECK: oc 0(1), 0
 0xd6 0x00 0x00 0x00 0x00 0x00
 
@@ -6346,6 +6385,21 @@
 # CHECK: popcnt %r7, %r8
 0xb9 0xe1 0x00 0x78
 
+# CHECK: ppa %r0, %r0, 0
+0xb2 0xe8 0x00 0x00
+
+# CHECK: ppa %r0, %r0, 15
+0xb2 0xe8 0xf0 0x00
+
+# CHECK: ppa %r0, %r15, 0
+0xb2 0xe8 0x00 0x0f
+
+# CHECK: ppa %r4, %r6, 7
+0xb2 0xe8 0x70 0x46
+
+# CHECK: ppa %r15, %r0, 0
+0xb2 0xe8 0x00 0xf0
+
 # CHECK: risbg %r0, %r0, 0, 0, 0
 0xec 0x00 0x00 0x00 0x00 0x55
 
@@ -8062,6 +8116,93 @@
 # CHECK: sy %r15, 0
 0xe3 0xf0 0x00 0x00 0x00 0x5b
 
+# CHECK: tabort 0
+0xb2 0xfc 0x00 0x00
+
+# CHECK: tabort 0(%r1)
+0xb2 0xfc 0x10 0x00
+
+# CHECK: tabort 0(%r15)
+0xb2 0xfc 0xf0 0x00
+
+# CHECK: tabort 4095
+0xb2 0xfc 0x0f 0xff
+
+# CHECK: tabort 4095(%r1)
+0xb2 0xfc 0x1f 0xff
+
+# CHECK: tabort 4095(%r15)
+0xb2 0xfc 0xff 0xff
+
+# CHECK: tbegin 0, 0
+0xe5 0x60 0x00 0x00 0x00 0x00
+
+# CHECK: tbegin 4095, 0
+0xe5 0x60 0x0f 0xff 0x00 0x00
+
+# CHECK: tbegin 0, 0
+0xe5 0x60 0x00 0x00 0x00 0x00
+
+# CHECK: tbegin 0, 1
+0xe5 0x60 0x00 0x00 0x00 0x01
+
+# CHECK: tbegin 0, 32767
+0xe5 0x60 0x00 0x00 0x7f 0xff
+
+# CHECK: tbegin 0, 32768
+0xe5 0x60 0x00 0x00 0x80 0x00
+
+# CHECK: tbegin 0, 65535
+0xe5 0x60 0x00 0x00 0xff 0xff
+
+# CHECK: tbegin 0(%r1), 42
+0xe5 0x60 0x10 0x00 0x00 0x2a
+
+# CHECK: tbegin 0(%r15), 42
+0xe5 0x60 0xf0 0x00 0x00 0x2a
+
+# CHECK: tbegin 4095(%r1), 42
+0xe5 0x60 0x1f 0xff 0x00 0x2a
+
+# CHECK: tbegin 4095(%r15), 42
+0xe5 0x60 0xff 0xff 0x00 0x2a
+
+# CHECK: tbeginc 0, 0
+0xe5 0x61 0x00 0x00 0x00 0x00
+
+# CHECK: tbeginc 4095, 0
+0xe5 0x61 0x0f 0xff 0x00 0x00
+
+# CHECK: tbeginc 0, 0
+0xe5 0x61 0x00 0x00 0x00 0x00
+
+# CHECK: tbeginc 0, 1
+0xe5 0x61 0x00 0x00 0x00 0x01
+
+# CHECK: tbeginc 0, 32767
+0xe5 0x61 0x00 0x00 0x7f 0xff
+
+# CHECK: tbeginc 0, 32768
+0xe5 0x61 0x00 0x00 0x80 0x00
+
+# CHECK: tbeginc 0, 65535
+0xe5 0x61 0x00 0x00 0xff 0xff
+
+# CHECK: tbeginc 0(%r1), 42
+0xe5 0x61 0x10 0x00 0x00 0x2a
+
+# CHECK: tbeginc 0(%r15), 42
+0xe5 0x61 0xf0 0x00 0x00 0x2a
+
+# CHECK: tbeginc 4095(%r1), 42
+0xe5 0x61 0x1f 0xff 0x00 0x2a
+
+# CHECK: tbeginc 4095(%r15), 42
+0xe5 0x61 0xff 0xff 0x00 0x2a
+
+# CHECK: tend
+0xb2 0xf8 0x00 0x00
+
 # CHECK: tm 0, 0
 0x91 0x00 0x00 0x00
 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233803 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-01 12:51:43 +00:00
Hal Finkel
54ab6ce385 [PowerPC] FastISel can't handle i1 return values when using CR bits
Under normal circumstances, use of CR bits is disabled when running at -O0, but
it is enabled by default otherwise, and if you have optnone functions, they'll
still generally be generated with crbits turned on (because nothing else turns
them off). FastISel can't handle most things dealing with i1 values when using
CR bits, and checks for that, but was not checking the return type on
functions; we can't fast-isel function calls with i1 return values either when
using CR bits for boolean values.

Fixes PR22664.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233775 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-01 00:40:48 +00:00
David Majnemer
64386621ec [WinEH] Generate .xdata for catch handlers
This lets us catch exceptions in simple cases.

N.B. Things that do not work include (but are not limited to):
- Throwing from within a catch handler.
- Catching an object with a named catch parameter.
- 'CatchHigh' is fictitious, we aren't sure of its purpose.
- We aren't entirely efficient with regards to the number of EH states
  that we generate.
- IP-to-State tables are sensitive to the order of emission.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233767 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 22:35:44 +00:00