Commit Graph

29748 Commits

Author SHA1 Message Date
Nick Lewycky
d4b4e3e20f Subtraction is not commutative. Fixes PR23212!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234780 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-13 19:17:37 +00:00
Krzysztof Parzyszek
b6852d12a8 Remove this test until I figure out why it fails
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234777 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-13 18:57:50 +00:00
Duncan P. N. Exon Smith
88116fe71a Reapply "Verifier: Check for incompatible bit piece expressions"
This reverts commit r234717, reapplying r234698 (in spirit).

As described in r234717, the original `Verifier` check had a
use-after-free.  Instead of storing pointers to "interesting" debug info
intrinsics whose bit piece expressions should be verified once we have
typerefs, do a second traversal.  I've added a testcase to catch the
`llc` crasher.

Original commit message:

    Verifier: Check for incompatible bit piece expressions

    Convert an assertion into a `Verifier` check.  Bit piece expressions
    must fit inside the variable, and mustn't be the entire variable.
    Catching this in the verifier will help us find bugs sooner, and makes
    `DIVariable::getSizeInBits()` dead code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234776 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-13 18:53:11 +00:00
Matthias Braun
fabde10209 Use FileCheck for test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234774 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-13 18:47:19 +00:00
Akira Hatanaka
a50a335767 [inliner] Don't inline a function if it doesn't have exactly the same
target-cpu and target-features attribute strings as the caller.

Differential Revision: http://reviews.llvm.org/D8984


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234773 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-13 18:43:38 +00:00
Krzysztof Parzyszek
75f1ab4b1b Make the ARM testcase from r234764 also pass on Thumb
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234772 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-13 18:39:52 +00:00
Jan Vesely
a017ce21ba Revert revisions r234755, r234759, r234760
Revert "Remove default in fully-covered switch (to fix Clang -Werror -Wcovered-switch-default)"
Revert "R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO"
Revert "LegalizeDAG: Try to use Overflow operations when expanding ADD/SUB"

Using overflow operations fails CodeGen/Generic/2011-07-07-ScheduleDAGCrash.ll
on hexagon, nvptx, and r600. Revert while I investigate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234768 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-13 17:47:15 +00:00
Krzysztof Parzyszek
fcc330abfe Allow memory intrinsics to be tail calls
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234764 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-13 17:16:45 +00:00
Matthias Braun
12a7039644 DAGCombiner: Fix crash in select(select) opt.
In case of different types used for the condition of the selects the
select(select) -> select(and) normalisation cannot be performed.

See also: http://reviews.llvm.org/D7622

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234763 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-13 17:16:33 +00:00
Jan Vesely
4ce8b9c7fb R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO
v2: tighten the sub64 tests
v3: rename to CARRY/BORROW
v4: fixup test cmdline
    add known bits computation
    use sign extend instead of sub 0,x
    better add test
v5: remove redundant break
    move lowering to separate functions
    fix comments

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewers: arsenm

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234759 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-13 16:26:00 +00:00
David Blaikie
2d35a46ea5 llvm-readobj: teach it to handle MachO Universal Archive correctly
Patch by Chilledheart (rwindz0@gmail.com).

Reviewed By: rafael

Differential Revision: http://reviews.llvm.org/D8773

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234758 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-13 16:05:49 +00:00
Jan Vesely
187ac42686 LegalizeDAG: Try to use Overflow operations when expanding ADD/SUB
v2: consider BooleanContents when processing overflow

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewers: resistor, jholewinsky (nvidia parts)
Differential Revision: http://reviews.llvm.org/D6340

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234755 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-13 15:32:01 +00:00
John Brawn
afa4193fc8 [ARM] Align global variables passed to memory intrinsics
Fill in the TODO in CodeGenPrepare::OptimizeCallInst so that global
variables that are passed to memory intrinsics are aligned in the same
way that allocas are.

Differential Revision: http://reviews.llvm.org/D8421


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234735 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-13 10:47:39 +00:00
NAKAMURA Takumi
10a56a29ef llvm/test/CodeGen/R600/fminnum.ll: Relax an expression for NaN on MSVCRT like r204118.
<stdin>:202:2: note: possible intended match here
   2143289344(1.#QNAN0e+00), 2(2.802597e-45)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234719 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-13 04:54:06 +00:00
Jan Vesely
b7239d3aa5 R600: Make FMIN/MAXNUM legal on all asics
v2: Add tests

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
reviewer: arsenm

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234716 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-12 23:45:05 +00:00
Petr Hosek
054db7df5b [MC] Write padding into fragments when -mc-relax-all flag is used
Summary:
When instruction bundling is enabled and the -mc-relax-all flag is
set, we can write bundle padding directly into fragments and avoid
creating large number of fragments significantly reducing LLVM MC
memory usage.

Test Plan: Regression test attached

Reviewers: eliben

Subscribers: jfb, mseaborn

Differential Revision: http://reviews.llvm.org/D8072

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234714 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-12 23:42:25 +00:00
Lang Hames
2ae6c65416 [Orc] During module partitioning, rename anonymous and asm-private globals.
If they're not (re)named, these globals will fail to resolve when the
partitioned modules are linked.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234707 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-12 20:05:51 +00:00
Hal Finkel
ad96655905 [PowerPC] Really iterate over all loops in PPCLoopDataPrefetch/PPCLoopPreIncPrep
When I fixed these a couple of days ago to iterate over all loops, not just
depth == 1 loops, I inadvertently made it such that we'd only look at the first
top-level loop. Make sure that we really look at all of them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234705 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-12 17:18:56 +00:00
Sanjoy Das
e0f4a11a89 [LoopUnrollRuntime] Clean up a predicate.
Clean up a predicate I added in r229731, fix the relevant comment and
add a test case.  The earlier version is confusing to read and was also
buggy (probably not a coincidence) till Alexey fixed it in r233881.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234701 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-12 01:24:01 +00:00
Hal Finkel
1cea397876 [PowerPC] Disable part-word atomics on the P7
As it turns out, even though these are part of ISA 2.06, the P7 does not
support them (or, at least, not any P7s we're tested so far).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234686 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-11 13:40:36 +00:00
Nemanja Ivanovic
9ca031b6c6 Add direct moves to/from VSR and exploit them for FP/INT conversions
This patch corresponds to review:
http://reviews.llvm.org/D8928

It adds direct move instructions to/from VSX registers to GPR's. These are
exploited for FP <-> INT conversions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234682 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-11 10:40:42 +00:00
Hal Finkel
ff49ef7838 [PowerPC] Fix PPCLoopPreIncPrep for depth > 1 loops
This pass had the same problem as the data-prefetching pass: it was only
checking for depth == 1 loops in practice. Fix that, add some debugging
statements, and make sure that, when we grab an AddRec, it is for the loop we
expect.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234670 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-11 00:33:08 +00:00
Ahmed Bougacha
d2069333ee [CodeGen] Split -enable-global-merge into ARM and AArch64 options.
Currently, there's a single flag, checked by the pass itself.
It can't force-enable the pass (and is on by default), because it
might not even have been created, as that's the targets decision.
Instead, have separate explicit flags, so that the decision is
consistently made in the target.

Keep the flag as a last-resort "force-disable GlobalMerge" for now,
for backwards compatibility.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234666 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-11 00:06:36 +00:00
Reid Kleckner
c127686d0e [WinEH] Recognize SEH finally block inserted by the frontend
This allows winehprepare to build sensible llvm.eh.actions calls for SEH
finally blocks.  The pattern matching in this change is brittle and
should be replaced with something more robust soon.  In the meantime,
this will let us write the code that produces __C_specific_handler xdata
tables, which we need regardless of how we decide to get finally blocks
through EH preparation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234663 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-10 23:12:29 +00:00
Philip Reames
f894c7ecb2 [RewriteStatepointsForGC] test case missing from 234657
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234658 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-10 22:58:39 +00:00
Philip Reames
d92e9ef170 [RewriteStatepointsForGC] Use an actual liveness algorithm
When rewriting statepoints to make relocations explicit, we need to have a conservative but consistent notion of where a particular pointer is live at a particular site. The old code just used dominance, which is correct, but decidedly more conservative then it needed to be. This patch implements a simple dataflow algorithm that's run one per function (well, twice counting fixup after base pointer insertion). There's still lots of room to make this faster, but it's fast enough for all practical purposes today.

Differential Revision: http://reviews.llvm.org/D8674



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234657 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-10 22:53:14 +00:00
Philip Reames
82374f6b8a [RewriteStatepointsForGC] Preprocess the IR to remove unreachable blocks and single entry phis
Two related small changes:

    Various dominance based queries about liveness can get confused if we're talking about unreachable blocks. To avoid reasoning about such cases, just remove them before rewriting statepoints.
    Remove single entry phis (likely left behind by LCSSA) to reduce the number of live values.

Both of these are motivated by http://reviews.llvm.org/D8674 which will be submitted shortly.

Differential Revision: http://reviews.llvm.org/D8675



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234651 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-10 22:07:04 +00:00
Philip Reames
cb148c8541 [RewriteStatepointsForGC] Limited support for vectors of pointers
This patch adds limited support for inserting explicit relocations when there's a vector of pointers live over the statepoint. This doesn't handle the case where the vector contains a mix of base and non-base pointers; that's future work.

The current implementation just scalarizes the vector over the gc.statepoint before doing the explicit rewrite. An alternate approach would be to plumb the vector all the way though the backend lowering, but doing that appears challenging. In particular, the size of the indirect spill slot is currently assumed to be sizeof(pointer) throughout the backend.

In practice, this is enough to allow running the SLP and Loop vectorizers before RewriteStatepointsForGC.

Differential Revision: http://reviews.llvm.org/D8671



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234647 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-10 21:48:25 +00:00
Sanjoy Das
8aca90e5b6 [InstCombine][CodeGenPrep] Create llvm.uadd.with.overflow in CGP.
Summary:
This change moves creating calls to `llvm.uadd.with.overflow` from
InstCombine to CodeGenPrep.  Combining overflow check patterns into
calls to the said intrinsic in InstCombine inhibits optimization because
it introduces an intrinsic call that not all other transforms and
analyses understand.

Depends on D8888.

Reviewers: majnemer, atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8889

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234638 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-10 21:07:09 +00:00
Reid Kleckner
79db0a6fd9 Avoid spewing binary to stdout in some filetype=obj tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234627 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-10 19:36:55 +00:00
Sanjay Patel
dc0ca89635 use update_llc_test_checks.py to tighten checking
test features, not CPUs

remove unnecessary cruft


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234622 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-10 18:31:42 +00:00
Reid Kleckner
10d1d8a3bd [WinEH] Try to make outlining invokes work a little better
WinEH currently turns invokes into calls. Long term, we will reconsider
this, but for now, make sure we remap the operands and clone the
successors of the new terminator.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234608 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-10 16:26:42 +00:00
Hal Finkel
d4f643a5df [PowerPC] Prefetching should also consider depth > 1 loops
Iterating over loops from the LoopInfo instance only provides top-level loops.
We need to search the whole tree of loops to find the inner ones.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234603 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-10 15:05:02 +00:00
Toma Tabacu
bb11bbe297 [mips] [IAS] Make the mips-expansions-bad.s test more readable. NFC.
Move the check lines below the code lines and change the indentation from 8
spaces to 2 spaces.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234584 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-10 10:46:59 +00:00
Jingyue Wu
5733100450 Divergence analysis for GPU programs
Summary:
Some optimizations such as jump threading and loop unswitching can negatively
affect performance when applied to divergent branches. The divergence analysis
added in this patch conservatively estimates which branches in a GPU program
can diverge. This information can then help LLVM to run certain optimizations
selectively.

Test Plan: test/Analysis/DivergenceAnalysis/NVPTX/diverge.ll

Reviewers: resistor, hfinkel, eliben, meheff, jholewinski

Subscribers: broune, bjarke.roune, madhur13490, tstellarAMD, dberlin, echristo, jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D8576

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234567 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-10 05:03:50 +00:00
Hal Finkel
36f934a207 [PowerPC] Don't crash on PPC32 i64 fp_to_uint on modern cores
When we have an instruction for this (and, thus, don't generate a runtime
call), we need to custom type legalize this (in a trivial way, just as we do
for fp_to_sint).

Fixes PR23173.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234561 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-10 03:39:00 +00:00
Ahmed Bougacha
1810ca3110 [AArch64] Promote f16 operations to f32.
For the most common ones (such as fadd), we already did the promotion.
Do the same thing for all the others.

Currently, we'll just crash/assert on all these operations, as
there's no hardware or libcall support whatsoever.

f16 (half) is specified as an interchange - not arithmetic - format,
and is expected to be promoted to single-precision for arithmetic
operations.

While there, teach the legalizer about promoting some of the (mostly
floating-point) operations that we never needed before.

Differential Revision: http://reviews.llvm.org/D8648
See related discussion on the thread for: http://reviews.llvm.org/D8755


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234550 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-10 00:08:48 +00:00
Nemanja Ivanovic
c58b8f0b65 Add LLVM support for remaining integer divide and permute instructions from ISA 2.06
This is the patch corresponding to review:
http://reviews.llvm.org/D8406

It adds some missing instructions from ISA 2.06 to the PPC back end.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234546 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-09 23:54:37 +00:00
Ahmed Bougacha
66649e00c9 [CodeGen] Combine concat_vector of trunc'd scalar to scalar_to_vector.
We already do:
  concat_vectors(scalar, undef) -> scalar_to_vector(scalar)
When the scalar is legal.
When it's not, but is a truncated legal scalar, we can also do:
  concat_vectors(trunc(scalar), undef) -> scalar_to_vector(scalar)
Which is equivalent, since the upper lanes are undef anyway.
While there, teach the combine to look at more than 2 operands.

Differential Revision: http://reviews.llvm.org/D8883


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234530 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-09 20:04:47 +00:00
Juergen Ributzka
117bf240ef [AArch64][FastISel] Fix integer extend optimization.
The integer extend optimization tries to fold the extend into the load
instruction. This requires us to identify if the extend has already been
emitted or not and act accordingly on it.

The check that was originally performed for this was not sufficient. Besides
checking the ValueMap for a mapped register we also need to check if the
virtual register has already an associated machine instruction that defines it.

This fixes rdar://problem/20470788.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234529 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-09 20:00:46 +00:00
Rafael Espindola
57a24199de Revert "Refactoring and enhancement to FMA combine."
This reverts commit r234513. It was failing on the bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234518 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-09 18:29:32 +00:00
Olivier Sallenave
ef67194fd2 Refactoring and enhancement to FMA combine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234513 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-09 17:55:26 +00:00
Javed Absar
28c2fda9df [ARM] support for Cortex-R4/R4F
Currently, llvm (backend) doesn't know cortex-r4, even though it is the
default target for armv7r. Using "--target=armv7r-arm-none-eabi" provokes
'cortex-r4' is not a recognized processor for this target' by llvm.
This patch adds support for cortex-r4 and, very closely related, r4f.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234486 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-09 14:07:28 +00:00
Kristof Beyls
2a6ad5bfb2 [AArch64] Add support for dynamic stack alignment
Differential Revision: http://reviews.llvm.org/D8876



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234471 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-09 08:49:47 +00:00
Lang Hames
64008ef318 [AArch64] Remove redundant -march option. Also fix a think-o from r234462.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234467 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-09 05:34:57 +00:00
Nick Lewycky
7ca40334f1 Not all triples put _ before function names. Specify a triple to make this test pass on Linux.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234466 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-09 05:31:32 +00:00
Lang Hames
174f04eefb [AArch64] Teach AArch64TargetLowering::getOptimalMemOpType to consider alignment
restrictions when choosing a type for small-memcpy inlining in
SelectionDAGBuilder.

This ensures that the loads and stores output for the memcpy won't be further
expanded during legalization, which would cause the total number of instructions
for the memcpy to exceed (often significantly) the inlining thresholds.

<rdar://problem/17829180>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234462 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-09 03:40:33 +00:00
Akira Hatanaka
2c3c562b03 Use option -march instead of -mtriple to avoid overconditionalizing the test.
This fixes r234439, which was committed to fix the test failures caused by
r234430.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234451 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 23:02:45 +00:00
Akira Hatanaka
0400513fd3 Pass -mtriple to llc to appease buildbot.
This fixes the test case I committed in r234430.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234439 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 21:30:48 +00:00
Andrew Kaylor
a9180a2fac [WinEH] Minor bug fixes.
Fixed insert point for allocas created for demoted values.
Clear the nested landing pad list after it has been processed.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234433 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 20:57:22 +00:00
Akira Hatanaka
522877813a [DAGCombine] Fix a bug in MergeConsecutiveStores.
The bug manifests when there are two loads and two stores chained as follows in
a DAG,

(ld v3f32) -> (st f32) -> (ld v3f32) -> (st f32)

and the stores' values are extracted from the preceding vector loads.

MergeConsecutiveStores would replace the first store in the chain with the
merged vector store, which would create a cycle between the merged store node
and the last load node that appears in the chain.

This commits fixes the bug by replacing the last store in the chain instead.

rdar://problem/20275084

Differential Revision: http://reviews.llvm.org/D8849


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234430 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 20:34:53 +00:00
Adam Nemet
cd13a3808a [LoopAccesses] Allow analysis to complete in the presence of uniform stores
(Re-apply r234361 with a fix and a testcase for PR23157)

Both run-time pointer checking and the dependence analysis are capable
of dealing with uniform addresses. I.e. it's really just an orthogonal
property of the loop that the analysis computes.

Run-time pointer checking will only try to reason about SCEVAddRec
pointers or else gives up. If the uniform pointer turns out the be a
SCEVAddRec in an outer loop, the run-time checks generated will be
correct (start and end bounds would be equal).

In case of the dependence analysis, we work again with SCEVs. When
compared against a loop-dependent address of the same underlying object,
the difference of the two SCEVs won't be constant. This will result in
returning an Unknown dependence for the pair.

When compared against another uniform access, the difference would be
constant and we should return the right type of dependence
(forward/backward/etc).

The changes also adds support to query this property of the loop and
modify the vectorizer to use this.

Patch by Ashutosh Nema!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234424 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 17:48:40 +00:00
Scott Douglass
7e2bc24e05 [ARM] make vminnm/vmaxnm work with ?le, ?ge and no-nans-fp-math
Because -menable-no-nans causes fcmp conditions to be rewritten
without 'o' or 'u' the recognition code in needs to cope. Also
extended it to handle 'le' and 'ge.

Differential Revision: http://reviews.llvm.org/D8725


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234421 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 17:18:28 +00:00
Sanjay Patel
912be27c05 fixed to test features, not CPU models
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234413 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 16:51:42 +00:00
Toma Tabacu
739ca842aa [mips] [IAS] Do not generate redundant move when expanding lw/sw with symbol.
Summary:
Even though there is no 2nd register operand in the "lw/sw $8, symbol" case, we still try to find one, 
and we end up with $0, which makes us generate an unnecessary "addu $8, $8, $0" (a.k.a. "move $8, $8").

We can avoid this by checking if the 2nd register operand is different from $0, before generating the addu.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8055

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234406 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 13:52:41 +00:00
Toma Tabacu
f716ca43ca [mips] [IAS] Add support for the BNEZL and BEQZL pseudo-instructions.
Summary:
They are of the form "bnezl/beqzl $rs, offset" and expand to "bnel/beql $rs, $zero, offset".

These instructions are used in Linux inline assembly.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8540

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234401 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 12:15:05 +00:00
Rafael Espindola
541279fdf9 Write the section header in the end.
One could make the argument for writing it immediately after the ELF header,
but writing it in the middle of the sections like we were doing just makes
it harder for no reason.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234400 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 11:41:24 +00:00
Sergey Dmitrouk
a7512d1d4a [ARM][Debug Info] Restore emitting of .cfi_def_cfa_offset for functions without stack frame
Summary: Looks like new code from [[ http://reviews.llvm.org/rL222057 | rL222057 ]] doesn't account for early `return` in `ARMFrameLowering::emitPrologue`, which leads to loosing `.cfi_def_cfa_offset` directive for functions without stack frame.

Reviewers: echristo, rengolin, asl, t.p.northover

Reviewed By: t.p.northover

Subscribers: llvm-commits, rengolin, aemerson

Differential Revision: http://reviews.llvm.org/D8606

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234399 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 10:10:12 +00:00
Toma Tabacu
39fedc9aa2 [mips] [IAS] Remove AssemblerPredicate's from RelocPIC and RelocStatic.
Summary:
These AssemblerPredicate's are unnecessary and actually make some instructions unusable when assembling pre-MIPS32 ISAs.
For example, this was causing the IAS to reject the 'j' instruction for MIPS I-V.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8300

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234398 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 10:06:45 +00:00
Adam Nemet
b343d1cd85 Revert "[LoopAccesses] Allow analysis to complete in the presence of uniform stores"
This reverts commit r234361.

It caused PR23157.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234387 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 04:16:55 +00:00
Tom Stellard
a787066317 R600/SI: Initial support for assembler and inline assembly
This is currently considered experimental, but most of the more
commonly used instructions should work.

So far only SI has been extensively tested, CI and VI probably work too,
but may be buggy.  The current set of tests cases do not give complete
coverage, but I think it is sufficient for an experimental assembler.

See the documentation in R600Usage for more information.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234381 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 01:09:26 +00:00
Tom Stellard
434e097df8 R600/SI: Don't print offset0/offset1 DS operands when they are 0
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234379 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-08 01:09:19 +00:00
Tim Northover
112102c7fe AArch64: disallow "fmov sD, #-0.0" during assembly.
We weren't checking the sign of the floating point immediate before translating
it to "fmov sD, wzr". Similarly for D-regs.

Technically "movi vD.2s, #0x80, lsl #24" would work most of the time, but it's
not a blessed alias (and I don't think it should be since people expect writing
sD to zero out the high lanes, and there's no dD equivalent). So an error it is.

rdar://20455398

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234372 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-07 22:49:47 +00:00
Adam Nemet
a0834f1d87 [LoopAccesses] Allow analysis to complete in the presence of uniform stores
Both run-time pointer checking and the dependence analysis are capable
of dealing with uniform addresses. I.e. it's really just an orthogonal
property of the loop that the analysis computes.

Run-time pointer checking will only try to reason about SCEVAddRec
pointers or else gives up. If the uniform pointer turns out the be a
SCEVAddRec in an outer loop, the run-time checks generated will be
correct (start and end bounds would be equal).

In case of the dependence analysis, we work again with SCEVs. When
compared against a loop-dependent address of the same underlying object,
the difference of the two SCEVs won't be constant. This will result in
returning an Unknown dependence for the pair.

When compared against another uniform access, the difference would be
constant and we should return the right type of dependence
(forward/backward/etc).

The changes also adds support to query this property of the loop and
modify the vectorizer to use this.

Patch by Ashutosh Nema!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234361 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-07 21:46:16 +00:00
Reid Kleckner
1106660066 [WinEH] Fix xdata generation when no catch object is present
The lack of a catch object is indicated by a frame escape index of -1.

Fixes PR23137.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234346 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-07 19:46:38 +00:00
Toma Tabacu
db3b3a0b9f [mips] [IAS] Allow .set assignments for already defined symbols.
Summary:
This is not possible when using the IAS for MIPS, but it is possible when using the IAS for other architectures and when using GAS for MIPS.


Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8578

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234316 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-07 13:59:39 +00:00
Rafael Espindola
838c24a7c8 Refactor a lot of duplicated code for stub output.
This also moves it earlier so that it they are produced before we print
an end symbol for the data section.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234315 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-07 13:42:44 +00:00
Toma Tabacu
0e407e7bbf [TableGen] Prevent invalid code generation when emitting AssemblerPredicate conditions.
Summary:
The loop which emits AssemblerPredicate conditions also links them together by emitting a '&&'.
If the 1st predicate is not an AssemblerPredicate, while the 2nd one is, nothing gets emitted for the 1st one, but we still emit the '&&' because of the 2nd predicate.
This generated code looks like "( && Cond2)" and is invalid.

Reviewers: dsanders

Reviewed By: dsanders

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D8294

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234312 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-07 12:10:11 +00:00
Daniel Jasper
ee6f78817a Add test showing that MachineLICM is calculating register pressure wrong
More details: http://llvm.org/PR23143

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234309 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-07 11:41:40 +00:00
Rafael Espindola
7bdc1cb690 Use sext in fast isel.
Fast isel used to zero extends immediates to 64 bits. This normally goes
unnoticed because the value is truncated to 32 bits for output.

Two cases were it is noticed:

* We fail to use smaller encodings.
* If the original constant was smaller than i32.

In the tests using i1 constants, codegen would change to use -1, which is fine
(and matches what regular isel does) since only the lowest bit is then used.

Instead, this patch then changes the ir to use i8 constants, which looks more
like what clang produces.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234249 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-06 22:29:07 +00:00
Reid Kleckner
ebb3c53316 [WinEH] Don't sink allocas into child handlers
The uselist isn't enough to infer anything about the lifetime of such
allocas. If we want to re-add this optimization, we will need to
leverage lifetime markers to do it.

Fixes PR23122.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234196 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-06 18:50:38 +00:00
Tim Northover
8af3f965e0 ARM: do not relax Thumb1 -> Thumb2 if only Thumb1 is available.
After recognising that a certain narrow instruction might need a relocation to
be represented, we used to unconditionally relax it to a Thumb2 instruction to
permit this. Unfortunately, some CPUs (e.g. v6m) don't even have most Thumb2
instructions, so we end up emitting a completely invalid instruction.

Theoretically, ELF does have relocations for these situations; but they are
fairly unusable with such short ranges and the ABI document even says they're
documented "for completeness". So an error is probably better there too.

rdar://20391953

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234195 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-06 18:44:42 +00:00
Simon Pilgrim
2ec7242600 [X86][SSE] Use (V)PINSRB for direct byte insertion in 16i8 buildvector on SSE4.1 targets
This patch allows SSE4.1 targets to use (V)PINSRB to create 16i8 vectors by inserting i8 scalars directly into a XMM register instead of merging pairs of i8 scalars into a i16 and using the SSE2 PINSRW instruction.

This allows folding of byte loads and reduces scalar register usage as well.

Differential Revision: http://reviews.llvm.org/D8839

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234193 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-06 18:39:00 +00:00
Kevin Enderby
000ffacf53 Fix failure on builder llvm-clang-lld-x86_64-debian-fast as the
test macho-objc-meta-data.test had a line it shouldn't have had.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234190 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-06 18:18:23 +00:00
Kevin Enderby
2e8b39e549 For llvm-objdump added support for printing Objc2 32-bit runtime meta data
with the existing -objc-meta-data and -macho options for Mach-O files.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234185 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-06 17:47:03 +00:00
Jingyue Wu
10483b93ea [SLSR] consider &B[S << i] as &B[(1 << i) * S]
Summary: This reduces handling &B[(1 << i) * s] to handling &B[i * S].

Test Plan: slsr-gep.ll

Reviewers: meheff

Subscribers: sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D8837

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234180 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-06 17:15:48 +00:00
Simon Pilgrim
7d424d47d3 [DAGCombiner] Add support for FCEIL, FFLOOR and FTRUNC vector constant folding
Differential Revision: http://reviews.llvm.org/D8715

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234179 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-06 17:15:41 +00:00
Duncan P. N. Exon Smith
fe42afc0e5 Verifier: Check composite type template params
Add missing checks for `templateParams:` in `MDCompositeType`.  Pull the
current check for `MDSubprogram` to reduce duplicated code and fix it up
to print a good message when the immediate operand isn't an `MDTuple`
(as a drive-by, make the same fix to `variables:` in `MDSubprogram`).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234177 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-06 17:04:58 +00:00
Rafael Espindola
6ba6e554c7 Use a comma after the unique keyword.
H.J. Lu noted that all .section options are separated by a comma.

This patch changes the syntax of unique to require one.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234174 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-06 16:34:41 +00:00
Rafael Espindola
9428f184be Be consistent when deciding if a relocation is needed.
Before when deciding if we needed a relocation in A-B, we wore only checking
if A was weak.

This fixes the asymmetry.

The "InSet" argument should probably be renamed to "ForValue", since InSet is
very MachO specific, but doing so in this patch would make it hard to read.

This fixes PR22815.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234165 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-06 15:27:57 +00:00
Rafael Espindola
83be4429b2 Store the sh_link of ARM_EXIDX directly in MCSectionELF.
This avoids some pretty horrible and broken name based section handling.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234142 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-06 04:25:18 +00:00
Rafael Espindola
903f4a2051 Implement unique sections with an unique ID.
This allows the compiler/assembly programmer to switch back to a
section. This in turn fixes the bootstrap failure on powerpc (tested
on gcc110) without changing the ppc codegen at all.

I will try to cleanup the various getELFSection overloads in a  followup patch.
Just using a default argument now would lead to ambiguities.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234099 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-04 18:02:01 +00:00
Simon Pilgrim
b8d7733666 [DAGCombiner] Canonicalize vector constants for ADD/MUL/AND/OR/XOR re-association
Scalar integers are commuted to move constants to the RHS for re-association - this ensures vectors do the same.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234092 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-04 10:20:31 +00:00
Eric Christopher
8bc9aa92aa Strip trailing whitespace and reword explanatory comment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234078 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-04 02:26:47 +00:00
Craig Topper
b1ff87ec86 [X86] Don't use GR64 register 'and with immediate' instructions if the immediate is zero in the upper 33-bits or upper 57-bits. Use GR32 instructions instead.
Previously the patterns didn't have high enough priority and we would only use the GR32 form if the only the upper 32 or 56 bits were zero.

Fixes PR23100.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234075 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-04 02:08:20 +00:00
David Majnemer
45221a75fb [WinEH] Fill out CatchHigh in the TryBlockMap
Now all fields in the WinEH xdata have been filled out.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234067 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 23:37:34 +00:00
David Majnemer
8431785c5e [WinEH] Fill out .xdata for catch objects
This add support for catching an exception such that an exception object
available to the catch handler will be initialized by the runtime.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234062 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 22:49:05 +00:00
David Majnemer
f89ce9a09d [WinEH] Sink UnwindHelp completely out of IR
We don't need to represent UnwindHelp in IR.  Instead, we can use the
knowledge that we are emitting the parent function to decide if we
should create the UnwindHelp stack object.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234061 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 22:32:26 +00:00
David Majnemer
33b9ae320a Fix a typo
CHECK-LABEL had the wrong function name.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234051 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 20:56:24 +00:00
David Majnemer
701c2fca7e [InstCombine] Use DataLayout to determine vector element width
InstCombine didn't realize that it needs to use DataLayout to determine
how wide pointers are.  This lead to assertion failures.

This fixes PR23113.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234046 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 20:18:40 +00:00
Andrew Kaylor
675e22e7ad [WinEH] Handle nested landing pads in outlined catch handlers
Differential Revision: http://reviews.llvm.org/D8596



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234041 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 19:37:50 +00:00
Sanjay Patel
0b654728bc use update_llc_test_checks.py to tighten checking; remove unnecessary testing params
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234029 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 17:17:50 +00:00
Sanjay Patel
c070a8f6ba use update_llc_test_checks.py to tighten checking; remove unnecessary testing params
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234027 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 17:13:31 +00:00
Sanjay Patel
a3ba15a9c9 use update_llc_test_checks.py to tighten checking; remove unnecessary testing params
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234024 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 17:09:37 +00:00
Sanjay Patel
5703d9f0b8 use update_llc_test_checks.py to tighten checking
remove redundant and unnecessary test parameters


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234022 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 17:02:48 +00:00
Duncan P. N. Exon Smith
b0d0a65e1d Verifier: Check that inlined-at locations agree
Check that the `MDLocalVariable::getInlinedAt()` in a debug info
intrinsic's variable always matches the `MDLocation::getInlinedAt()` of
its `!dbg` attachment.

The goal here is to get rid of `MDLocalVariable::getInlinedAt()`
entirely (PR22778), since it's expensive and unnecessary, but I'll let
this verifier check bake for a while (a week maybe?) first.  I've
updated the testcases that had the wrong value for `inlinedAt:`.

This checks that things are sane in the IR, but currently things go out
of whack in a few places in the backend.  I'll follow shortly with
assertions in the backend (with code fixes).

If you have out-of-tree testcases that just started failing, here's how
I updated these ones:

 1. The verifier check gives you the basic block, function, instruction,
    and relevant metadata arguments (metadata numbering doesn't
    necessarily match the source file, unfortunately).
 2. Look at the `@llvm.dbg.*()` instruction, and compare the
    `inlinedAt:` fields of the variable argument (second `metadata`
    argument) and the `!dbg` attachment.
 3. Figure out based on the variable `scope:` chain and the functions in
    the file whether the variable has been inlined (and into what), so
    you can determine which `inlinedAt:` is actually correct.  In all of
    the in-tree testcases, the `!MDLocation()` was correct and the
    `!MDLocalVariable()` was wrong, but YMMV.
 4. Duplicate the metadata that you're going to change, and add/drop the
    `inlinedAt:` field from one of them.  Be careful that the other
    references to the same metadata node point at the correct one.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234021 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 16:54:30 +00:00
Sanjay Patel
f58fa53c99 add checks; remove redundant testing parameters
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234020 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 16:44:42 +00:00
Duncan P. N. Exon Smith
2d2520986e CodeGen: Fix MachineInstr::print() for DBG_VALUE
Grab the `MDLocalVariable` from the second-to-last argument; the last
argument is an `MDExpression`, and mixing them up will crash.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234019 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 16:23:04 +00:00
Sanjay Patel
8fea4abae0 use update_llc_test_checks.py to tighten checking; remove darwin and sandybridge overspecification
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234017 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 16:06:58 +00:00
Simon Pilgrim
af519fbf42 Added vector tests for DAGCombiner::ReassociateOps
Missing vector tests for rL233482

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234015 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 15:04:46 +00:00
Simon Pilgrim
be149a8148 [X86] Added SSE4.2 CRC32 memory folding patterns + tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234013 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 14:24:40 +00:00
Bill Schmidt
1c4b6de5ed [PowerPC] Enable splat generation for BUILD_VECTOR with little endian
When enabling PPC64LE, I disabled some optimizations of BUILD_VECTOR
nodes for little endian because wrong results were produced.  I've
subsequently investigated and found this is due to a call to
BuildVectorSDNode::isConstantSplat that was always specifying
big-endian.  With this changed to correctly identify the target
endianness, the optimizations work as expected.

I found another case of a call to the same method with big-endian
hardcoded, in PPC::isAllNegativeZeroVector().  I discovered this was
an orphaned method with no callers, so I've just removed it.

The existing test/CodeGen/PowerPC/vec_constants.ll checks these
optimizations, so for testing I've just added a variant for little
endian.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234011 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 13:48:24 +00:00
Simon Pilgrim
e5ecd32488 [X86][3DNow] Added 3DNow! memory folding patterns + tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234008 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 11:50:30 +00:00
Simon Pilgrim
cbe57104de [X86][MMX] Added MMX stack folding tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234006 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 11:01:15 +00:00
Simon Pilgrim
4e60da755a [DAGCombiner] Combine shuffles of BUILD_VECTOR and SCALAR_TO_VECTOR
This patch attempts to fold the shuffling of 'scalar source' inputs - BUILD_VECTOR and SCALAR_TO_VECTOR nodes - if the shuffle node is the only user. This folds away a lot of unnecessary shuffle nodes, and allows quite a bit of constant folding that was being missed.

Differential Revision: http://reviews.llvm.org/D8516

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234004 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 10:02:21 +00:00
Peter Collingbourne
c39f5dd0e2 MC: For variable symbols, maintain MCSymbol::Section as a cache.
Fixes PR19582.

Previously, when an asm assignment (.set or =) was created, we would look up
the section immediately in MCSymbol::setVariableValue. This caused symbols
to receive the wrong section if the RHS of the assignment had not been seen
yet. This had a knock-on effect in the object file emitters, causing them
to emit extra symbols, or to give symbols the wrong visibility or the wrong
section. For example, in the following asm:

.data
.Llocal:

.text
leaq .Llocal1(%rip), %rdi
.Llocal1 = .Llocal2
.Llocal2 = .Llocal

the first assignment would give .Llocal1 a null section, which would never get
fixed up by the second assignment. This would cause the ELF object file emitter
to consider .Llocal1 to be an undefined symbol and give it external linkage,
even though .Llocal1 should not have been emitted at all in the object file.

Or in the following asm:

alias_to_local = Ltmp0
Ltmp0:

the Mach-O object file emitter would give the alias_to_local symbol a n_type
of N_SECT and a n_sect of 0.  This is invalid under the Mach-O specification,
which requires N_SECT symbols to receive a non-zero section number if the
symbol is defined in a section in the object file.

https://developer.apple.com/library/mac/documentation/DeveloperTools/Conceptual/MachORuntime/#//apple_ref/c/tag/nlist

After this change we do not look up the section when the assignment is created,
but instead look it up on demand and store it in Section, which is treated
as a cache if the symbol is a variable symbol.

This change also fixes a bug in MCExpr::FindAssociatedSection. Previously,
if we saw a subtraction, we would return the first referenced section, even in
cases where we should have been returning the absolute pseudo-section. Now we
always return the absolute pseudo-section for expressions that subtract two
section-derived expressions. This isn't always correct (e.g. if one of the
sections ends up being laid out at an absolute address), but it's probably
the best we can do without more context.

This allows us to remove code in two places where we appear to have been
working around this bug, in MachObjectWriter::markAbsoluteVariableSymbols
and in X86AsmPrinter::EmitStartOfAsmFile.

Re-applies r233595 (aka D8586), which was reverted in r233898.

Differential Revision: http://reviews.llvm.org/D8798

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233995 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 01:46:11 +00:00
Matthias Braun
67a2b82b14 ARM: Handle physreg targets in RegPair hints gracefully
Register coalescing can change the target of a RegPair hint to a
physreg, we should not crash on this. This also slightly improved the
way ARMBaseRegisterInfo::updateRegAllocHint() works.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233987 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-03 00:18:38 +00:00
Reid Kleckner
0f8f086e21 [ASan] Don't use stack malloc for 32-bit functions using inline asm
This prevents us from running out of registers in the backend.

Introducing stack malloc calls prevents the backend from recognizing the
inline asm operands as stack objects. When the backend recognizes a
stack object, it doesn't need to materialize the address of the memory
in a physical register. Instead it generates a simple SP-based memory
operand. Introducing a stack malloc forces the backend to find a free
register for every memory operand. 32-bit x86 simply doesn't have enough
registers for this to succeed in most cases.

Reviewers: kcc, samsonov

Differential Revision: http://reviews.llvm.org/D8790

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233979 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-02 21:44:55 +00:00
Jingyue Wu
6dedfc073a [SLSR] handles off bounds GEPs
Summary:
The old requirement on GEP candidates being in bounds is unnecessary.
For off-bound GEPs, we still have

  &B[i * S] = B + (i * S) * e = B + (i * e) * S

Test Plan: slsr_offbound_gep in slsr-gep.ll

Reviewers: meheff

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8809

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233949 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-02 21:18:32 +00:00
Reid Kleckner
df4fd4fc4e [WinEH] Make llvm.eh.actions use frameescape indices for catch params
This makes it possible to use the same representation of llvm.eh.actions
in outlined handlers as we use in the parent function because i32's are
just constants that can be copied freely between functions.

I had to add a sentinel alloca to the list of child allocas so that we
don't try to sink the catch object into the handler. Normally, one would
use nullptr for this kind of thing, but TinyPtrVector doesn't support
null elements. More than that, it's elements have to have a suitable
alignment. Therefore, I settled on this for my sentinel:

  AllocaInst *getCatchObjectSentinel() {
    return static_cast<AllocaInst *>(nullptr) + 1;
  }

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233947 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-02 21:13:31 +00:00
Sanjay Patel
5b93ab6cde [AVX] Improve insertion of i8 or i16 into low element of 256-bit zero vector
Without this patch, we split the 256-bit vector into halves and produced something like:
	movzwl	(%rdi), %eax
	vmovd	%eax, %xmm0
	vxorps	%xmm1, %xmm1, %xmm1
	vblendps	$15, %ymm0, %ymm1, %ymm0 ## ymm0 = ymm0[0,1,2,3],ymm1[4,5,6,7]

Now, we eliminate the xor and blend because those zeros are free with the vmovd:
        movzwl  (%rdi), %eax
        vmovd   %eax, %xmm0

This should be the final fix needed to resolve PR22685:
https://llvm.org/bugs/show_bug.cgi?id=22685




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233941 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-02 20:21:52 +00:00
Sanjay Patel
8765e82c83 [X86, AVX] adjust tablegen patterns to generate better code for scalar insertion into zero vector (PR23073)
For code like this:

define <8 x i32> @load_v8i32() {
  ret <8 x i32> <i32 7, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0>
}

We produce this AVX code:

_load_v8i32:                            ## @load_v8i32
  movl	$7, %eax
  vmovd	%eax, %xmm0
  vxorps	%ymm1, %ymm1, %ymm1
  vblendps	$1, %ymm0, %ymm1, %ymm0 ## ymm0 = ymm0[0],ymm1[1,2,3,4,5,6,7]
  retq

There are at least 2 bugs in play here:

    We're generating a blend when a move scalar does the same job using 2 less instruction bytes (see FIXMEs).
    We're not matching an existing pattern that would eliminate the xor and blend entirely. The zero bytes are free with vmovd.

The 2nd fix involves an adjustment of "AddedComplexity" [1] and mostly masks the 1st problem.

[1] AddedComplexity has close to no documentation in the source. 
The best we have is this comment: "roughly corresponds to the number of nodes that are covered". 
It appears that x86 has bastardized this definition by inflating its values for some other
undocumented reason. For example, we have a pattern with "AddedComplexity = 400" (!). 

I searched my way to this page:
https://groups.google.com/forum/#!topic/llvm-dev/5UX-Og9M0xQ

Differential Revision: http://reviews.llvm.org/D8794



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233931 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-02 17:56:17 +00:00
Elena Demikhovsky
4eb165220f AVX-512: intrinsics for VPADD, VPMULDQ and VPSUB
by Asaf Badouh (asaf.badouh@intel.com)


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233906 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-02 10:51:40 +00:00
Vasileios Kalintiris
cb64898a22 [mips] Make sure that we don't adjust the stack pointer by zero amount.
Reviewers: dsanders

Reviewed By: dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8638

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233904 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-02 10:14:54 +00:00
Peter Collingbourne
a8432640e8 Revert r233595, "MC: For variable symbols, maintain MCSymbol::Section as a cache."
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233898 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-02 07:02:51 +00:00
Lang Hames
2eb21d0245 [Orc] Fix local-linkage handling in the CompileOnDemand layer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233895 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-02 05:28:10 +00:00
Philip Reames
c47a3ae7d5 Teach gcroot how to handle dynamically realigned frames
I'm playing with supporting custom stack map formats with statepoints.  While 
doing so, I noticed that the existing implementation didn't indicate inherently 
unsized frames.  This change essentially just ports the functionality that already 
exists for the default StackMaps section to custom stackmaps.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233891 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-02 05:00:40 +00:00
Adam Nemet
1c43b926f6 [LoopAccesses] Remove unused global variables in tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233887 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-02 04:42:51 +00:00
Lang Hames
14ef491582 [Orc] Add support classes for inspecting and running C++ static ctor/dtors, and
use these to add support for C++ static ctors/dtors to the Orc-lazy JIT in LLI.

Replace the trivial_retval_1 regression test - the new 'hello' test is covering
strictly more code. 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233885 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-02 04:34:45 +00:00
Kevin Enderby
95d8155b37 Add the option -objc-meta-data to llvm-objdump used with -macho to
print the Objective-C runtime meta data for Mach-O files.

There are three types of Objective-C runtime meta data, Objc2 64-bit,
Objc2 32-bit and Objc1 32-bit.  This prints the first of these types. The
changes to print the others will follow next.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233840 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-01 20:57:01 +00:00
Benjamin Kramer
a066ed09db [X86] Don't accidentally select shll $1, %eax when shrinking an immediate.
addl has higher throughput and this was needlessly picking a suboptimal
encoding causing PR23098.

I wish there was a way of doing this without further duplicating tbl-
generated patterns, but so far I haven't found one.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233832 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-01 19:01:09 +00:00
Sanjoy Das
5436372079 [SCEV] Look at backedge dominating conditions (re-land r233447).
Summary:
This change teaches ScalarEvolution::isLoopBackedgeGuardedByCond to look
at edges within the loop body that dominate the latch.  We don't do an
exhaustive search for all possible edges, but only a quick walk up the
dom tree.

This re-lands r233447.  r233447 was reverted because it caused massive
compile-time regressions.  This change has a fix for the same issue.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233829 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-01 18:24:06 +00:00
Diego Novillo
32d9020423 Remove 4,096 loop scale limitation.
Summary:
This is part 1 of fixes to address the problems described in
https://llvm.org/bugs/show_bug.cgi?id=22719.

The restriction to limit loop scales to 4,096 does not really prevent
overflows anymore, as the underlying algorithm has changed and does
not seem to suffer from this problem.

Additionally, artificially restricting loop scales to such a low number
skews frequency information, making loops of equal hotness appear to
have very different hotness properties.

The only loops that are artificially restricted to a scale of 4096 are
infinite loops (those loops with an exit mass of 0). This prevents
infinite loops from skewing the frequencies of other regions in the CFG.

At the end of propagation, frequencies are scaled to values that take no
more than 64 bits to represent. When the range of frequencies to be
represented fits within 61 bits, it pushes up the scaling factor to a
minimum of 8 to better distinguish small frequency values. Otherwise,
small frequency values are all saturated down at 1.

Tested on x86_64.

Reviewers: dexonsmith

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8718

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233826 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-01 17:42:27 +00:00
Andrew Kaylor
6b2fe99659 Fix WinEHPrepare bug with multiple catch handlers
Differential Revision: http://reviews.llvm.org/D8682



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233824 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-01 17:21:25 +00:00
Vladimir Sukharev
0751793310 [ARM] Rename v8.1a from "extension" to "architecture"
v8.1a is renamed to architecture, following current entity naming approach.

Excess generic cpu is removed. Intended use: "generic" cpu with "v8.1a" subtarget feature

Reviewers: jmolloy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8767


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233811 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-01 14:54:56 +00:00
Ulrich Weigand
adf55a5a57 [SystemZ] Support transactional execution on zEC12
The zEC12 provides the transactional-execution facility.  This is exposed
to users via a set of builtin routines on other compilers.  This patch
adds LLVM support to enable those builtins.  In partciular, the patch:

- adds the transactional-execution and processor-assist facilities
- adds MC support for all instructions provided by those facilities
- adds LLVM intrinsics for those instructions and hooks them up for CodeGen
- adds CodeGen support to optimize CC return value checking

Since this is first use of target-specific intrinsics on the platform,
the patch creates the include/llvm/IR/IntrinsicsSystemZ.td file and
hooks it up in Intrinsics.td.  I've also changed Triple::getArchTypePrefix
to return "s390" instead of "systemz", since the naming convention for
GCC intrinsics uses "s390" on the platform, and it neemed more straight-
forward to use the same convention for LLVM IR intrinsics.

An associated clang patch makes the intrinsics (and command line switches)
available at the source-language level.

For reference, the transactional-execution instructions are documented
in the z/Architecture Principles of Operation for the zEC12:
http://publibfp.boulder.ibm.com/cgi-bin/bookmgr/download/DZ9ZR009.pdf
The associated builtins are documented in the GCC manual:
http://gcc.gnu.org/onlinedocs/gcc/S_002f390-System-z-Built-in-Functions.html


Index: llvm-head/lib/Target/SystemZ/SystemZOperators.td
===================================================================
--- llvm-head.orig/lib/Target/SystemZ/SystemZOperators.td
+++ llvm-head/lib/Target/SystemZ/SystemZOperators.td
@@ -79,6 +79,9 @@ def SDT_ZI32Intrinsic       : SDTypeProf
 def SDT_ZPrefetch           : SDTypeProfile<0, 2,
                                             [SDTCisVT<0, i32>,
                                              SDTCisPtrTy<1>]>;
+def SDT_ZTBegin             : SDTypeProfile<0, 2,
+                                            [SDTCisPtrTy<0>,
+                                             SDTCisVT<1, i32>]>;
 
 //===----------------------------------------------------------------------===//
 // Node definitions
@@ -180,6 +183,15 @@ def z_prefetch          : SDNode<"System
                                  [SDNPHasChain, SDNPMayLoad, SDNPMayStore,
                                   SDNPMemOperand]>;
 
+def z_tbegin            : SDNode<"SystemZISD::TBEGIN", SDT_ZTBegin,
+                                 [SDNPHasChain, SDNPOutGlue, SDNPMayStore,
+                                  SDNPSideEffect]>;
+def z_tbegin_nofloat    : SDNode<"SystemZISD::TBEGIN_NOFLOAT", SDT_ZTBegin,
+                                 [SDNPHasChain, SDNPOutGlue, SDNPMayStore,
+                                  SDNPSideEffect]>;
+def z_tend              : SDNode<"SystemZISD::TEND", SDTNone,
+                                 [SDNPHasChain, SDNPOutGlue, SDNPSideEffect]>;
+
 //===----------------------------------------------------------------------===//
 // Pattern fragments
 //===----------------------------------------------------------------------===//
Index: llvm-head/lib/Target/SystemZ/SystemZInstrFormats.td
===================================================================
--- llvm-head.orig/lib/Target/SystemZ/SystemZInstrFormats.td
+++ llvm-head/lib/Target/SystemZ/SystemZInstrFormats.td
@@ -473,6 +473,17 @@ class InstSS<bits<8> op, dag outs, dag i
   let Inst{15-0}  = BD2;
 }
 
+class InstS<bits<16> op, dag outs, dag ins, string asmstr, list<dag> pattern>
+  : InstSystemZ<4, outs, ins, asmstr, pattern> {
+  field bits<32> Inst;
+  field bits<32> SoftFail = 0;
+
+  bits<16> BD2;
+
+  let Inst{31-16} = op;
+  let Inst{15-0}  = BD2;
+}
+
 //===----------------------------------------------------------------------===//
 // Instruction definitions with semantics
 //===----------------------------------------------------------------------===//
Index: llvm-head/lib/Target/SystemZ/SystemZInstrInfo.td
===================================================================
--- llvm-head.orig/lib/Target/SystemZ/SystemZInstrInfo.td
+++ llvm-head/lib/Target/SystemZ/SystemZInstrInfo.td
@@ -1362,6 +1362,60 @@ let Defs = [CC] in {
 }
 
 //===----------------------------------------------------------------------===//
+// Transactional execution
+//===----------------------------------------------------------------------===//
+
+let Predicates = [FeatureTransactionalExecution] in {
+  // Transaction Begin
+  let hasSideEffects = 1, mayStore = 1,
+      usesCustomInserter = 1, Defs = [CC] in {
+    def TBEGIN : InstSIL<0xE560,
+                         (outs), (ins bdaddr12only:$BD1, imm32zx16:$I2),
+                         "tbegin\t$BD1, $I2",
+                         [(z_tbegin bdaddr12only:$BD1, imm32zx16:$I2)]>;
+    def TBEGIN_nofloat : Pseudo<(outs), (ins bdaddr12only:$BD1, imm32zx16:$I2),
+                                [(z_tbegin_nofloat bdaddr12only:$BD1,
+                                                   imm32zx16:$I2)]>;
+    def TBEGINC : InstSIL<0xE561,
+                          (outs), (ins bdaddr12only:$BD1, imm32zx16:$I2),
+                          "tbeginc\t$BD1, $I2",
+                          [(int_s390_tbeginc bdaddr12only:$BD1,
+                                             imm32zx16:$I2)]>;
+  }
+
+  // Transaction End
+  let hasSideEffects = 1, Defs = [CC], BD2 = 0 in
+    def TEND : InstS<0xB2F8, (outs), (ins), "tend", [(z_tend)]>;
+
+  // Transaction Abort
+  let hasSideEffects = 1, isTerminator = 1, isBarrier = 1 in
+    def TABORT : InstS<0xB2FC, (outs), (ins bdaddr12only:$BD2),
+                       "tabort\t$BD2",
+                       [(int_s390_tabort bdaddr12only:$BD2)]>;
+
+  // Nontransactional Store
+  let hasSideEffects = 1 in
+    def NTSTG : StoreRXY<"ntstg", 0xE325, int_s390_ntstg, GR64, 8>;
+
+  // Extract Transaction Nesting Depth
+  let hasSideEffects = 1 in
+    def ETND : InherentRRE<"etnd", 0xB2EC, GR32, (int_s390_etnd)>;
+}
+
+//===----------------------------------------------------------------------===//
+// Processor assist
+//===----------------------------------------------------------------------===//
+
+let Predicates = [FeatureProcessorAssist] in {
+  let hasSideEffects = 1, R4 = 0 in
+    def PPA : InstRRF<0xB2E8, (outs), (ins GR64:$R1, GR64:$R2, imm32zx4:$R3),
+                      "ppa\t$R1, $R2, $R3", []>;
+  def : Pat<(int_s390_ppa_txassist GR32:$src),
+            (PPA (INSERT_SUBREG (i64 (IMPLICIT_DEF)), GR32:$src, subreg_l32),
+                 0, 1)>;
+}
+
+//===----------------------------------------------------------------------===//
 // Miscellaneous Instructions.
 //===----------------------------------------------------------------------===//
 
Index: llvm-head/lib/Target/SystemZ/SystemZProcessors.td
===================================================================
--- llvm-head.orig/lib/Target/SystemZ/SystemZProcessors.td
+++ llvm-head/lib/Target/SystemZ/SystemZProcessors.td
@@ -60,6 +60,16 @@ def FeatureMiscellaneousExtensions : Sys
   "Assume that the miscellaneous-extensions facility is installed"
 >;
 
+def FeatureTransactionalExecution : SystemZFeature<
+  "transactional-execution", "TransactionalExecution",
+  "Assume that the transactional-execution facility is installed"
+>;
+
+def FeatureProcessorAssist : SystemZFeature<
+  "processor-assist", "ProcessorAssist",
+  "Assume that the processor-assist facility is installed"
+>;
+
 def : Processor<"generic", NoItineraries, []>;
 def : Processor<"z10", NoItineraries, []>;
 def : Processor<"z196", NoItineraries,
@@ -70,4 +80,5 @@ def : Processor<"zEC12", NoItineraries,
                 [FeatureDistinctOps, FeatureLoadStoreOnCond, FeatureHighWord,
                  FeatureFPExtension, FeaturePopulationCount,
                  FeatureFastSerialization, FeatureInterlockedAccess1,
-                 FeatureMiscellaneousExtensions]>;
+                 FeatureMiscellaneousExtensions,
+                 FeatureTransactionalExecution, FeatureProcessorAssist]>;
Index: llvm-head/lib/Target/SystemZ/SystemZSubtarget.cpp
===================================================================
--- llvm-head.orig/lib/Target/SystemZ/SystemZSubtarget.cpp
+++ llvm-head/lib/Target/SystemZ/SystemZSubtarget.cpp
@@ -40,6 +40,7 @@ SystemZSubtarget::SystemZSubtarget(const
       HasLoadStoreOnCond(false), HasHighWord(false), HasFPExtension(false),
       HasPopulationCount(false), HasFastSerialization(false),
       HasInterlockedAccess1(false), HasMiscellaneousExtensions(false),
+      HasTransactionalExecution(false), HasProcessorAssist(false),
       TargetTriple(TT), InstrInfo(initializeSubtargetDependencies(CPU, FS)),
       TLInfo(TM, *this), TSInfo(*TM.getDataLayout()), FrameLowering() {}
 
Index: llvm-head/lib/Target/SystemZ/SystemZSubtarget.h
===================================================================
--- llvm-head.orig/lib/Target/SystemZ/SystemZSubtarget.h
+++ llvm-head/lib/Target/SystemZ/SystemZSubtarget.h
@@ -42,6 +42,8 @@ protected:
   bool HasFastSerialization;
   bool HasInterlockedAccess1;
   bool HasMiscellaneousExtensions;
+  bool HasTransactionalExecution;
+  bool HasProcessorAssist;
 
 private:
   Triple TargetTriple;
@@ -102,6 +104,12 @@ public:
     return HasMiscellaneousExtensions;
   }
 
+  // Return true if the target has the transactional-execution facility.
+  bool hasTransactionalExecution() const { return HasTransactionalExecution; }
+
+  // Return true if the target has the processor-assist facility.
+  bool hasProcessorAssist() const { return HasProcessorAssist; }
+
   // Return true if GV can be accessed using LARL for reloc model RM
   // and code model CM.
   bool isPC32DBLSymbol(const GlobalValue *GV, Reloc::Model RM,
Index: llvm-head/lib/Support/Triple.cpp
===================================================================
--- llvm-head.orig/lib/Support/Triple.cpp
+++ llvm-head/lib/Support/Triple.cpp
@@ -92,7 +92,7 @@ const char *Triple::getArchTypePrefix(Ar
   case sparcv9:
   case sparc:       return "sparc";
 
-  case systemz:     return "systemz";
+  case systemz:     return "s390";
 
   case x86:
   case x86_64:      return "x86";
Index: llvm-head/include/llvm/IR/Intrinsics.td
===================================================================
--- llvm-head.orig/include/llvm/IR/Intrinsics.td
+++ llvm-head/include/llvm/IR/Intrinsics.td
@@ -634,3 +634,4 @@ include "llvm/IR/IntrinsicsNVVM.td"
 include "llvm/IR/IntrinsicsMips.td"
 include "llvm/IR/IntrinsicsR600.td"
 include "llvm/IR/IntrinsicsBPF.td"
+include "llvm/IR/IntrinsicsSystemZ.td"
Index: llvm-head/include/llvm/IR/IntrinsicsSystemZ.td
===================================================================
--- /dev/null
+++ llvm-head/include/llvm/IR/IntrinsicsSystemZ.td
@@ -0,0 +1,46 @@
+//===- IntrinsicsSystemZ.td - Defines SystemZ intrinsics ---*- tablegen -*-===//
+//
+//                     The LLVM Compiler Infrastructure
+//
+// This file is distributed under the University of Illinois Open Source
+// License. See LICENSE.TXT for details.
+//
+//===----------------------------------------------------------------------===//
+//
+// This file defines all of the SystemZ-specific intrinsics.
+//
+//===----------------------------------------------------------------------===//
+
+//===----------------------------------------------------------------------===//
+//
+// Transactional-execution intrinsics
+//
+//===----------------------------------------------------------------------===//
+
+let TargetPrefix = "s390" in {
+  def int_s390_tbegin : Intrinsic<[llvm_i32_ty], [llvm_ptr_ty, llvm_i32_ty],
+                                  [IntrNoDuplicate]>;
+
+  def int_s390_tbegin_nofloat : Intrinsic<[llvm_i32_ty],
+                                          [llvm_ptr_ty, llvm_i32_ty],
+                                          [IntrNoDuplicate]>;
+
+  def int_s390_tbeginc : Intrinsic<[], [llvm_ptr_ty, llvm_i32_ty],
+                                   [IntrNoDuplicate]>;
+
+  def int_s390_tabort : Intrinsic<[], [llvm_i64_ty],
+                                  [IntrNoReturn, Throws]>;
+
+  def int_s390_tend : GCCBuiltin<"__builtin_tend">,
+                      Intrinsic<[llvm_i32_ty], []>;
+
+  def int_s390_etnd : GCCBuiltin<"__builtin_tx_nesting_depth">,
+                      Intrinsic<[llvm_i32_ty], [], [IntrNoMem]>;
+
+  def int_s390_ntstg : Intrinsic<[], [llvm_i64_ty, llvm_ptr64_ty],
+                                 [IntrReadWriteArgMem]>;
+
+  def int_s390_ppa_txassist : GCCBuiltin<"__builtin_tx_assist">,
+                              Intrinsic<[], [llvm_i32_ty]>;
+}
+
Index: llvm-head/lib/Target/SystemZ/SystemZ.h
===================================================================
--- llvm-head.orig/lib/Target/SystemZ/SystemZ.h
+++ llvm-head/lib/Target/SystemZ/SystemZ.h
@@ -68,6 +68,18 @@ const unsigned CCMASK_TM_MSB_0       = C
 const unsigned CCMASK_TM_MSB_1       = CCMASK_2 | CCMASK_3;
 const unsigned CCMASK_TM             = CCMASK_ANY;
 
+// Condition-code mask assignments for TRANSACTION_BEGIN.
+const unsigned CCMASK_TBEGIN_STARTED       = CCMASK_0;
+const unsigned CCMASK_TBEGIN_INDETERMINATE = CCMASK_1;
+const unsigned CCMASK_TBEGIN_TRANSIENT     = CCMASK_2;
+const unsigned CCMASK_TBEGIN_PERSISTENT    = CCMASK_3;
+const unsigned CCMASK_TBEGIN               = CCMASK_ANY;
+
+// Condition-code mask assignments for TRANSACTION_END.
+const unsigned CCMASK_TEND_TX   = CCMASK_0;
+const unsigned CCMASK_TEND_NOTX = CCMASK_2;
+const unsigned CCMASK_TEND      = CCMASK_TEND_TX | CCMASK_TEND_NOTX;
+
 // The position of the low CC bit in an IPM result.
 const unsigned IPM_CC = 28;
 
Index: llvm-head/lib/Target/SystemZ/SystemZISelLowering.h
===================================================================
--- llvm-head.orig/lib/Target/SystemZ/SystemZISelLowering.h
+++ llvm-head/lib/Target/SystemZ/SystemZISelLowering.h
@@ -146,6 +146,15 @@ enum {
   // Perform a serialization operation.  (BCR 15,0 or BCR 14,0.)
   SERIALIZE,
 
+  // Transaction begin.  The first operand is the chain, the second
+  // the TDB pointer, and the third the immediate control field.
+  // Returns chain and glue.
+  TBEGIN,
+  TBEGIN_NOFLOAT,
+
+  // Transaction end.  Just the chain operand.  Returns chain and glue.
+  TEND,
+
   // Wrappers around the inner loop of an 8- or 16-bit ATOMIC_SWAP or
   // ATOMIC_LOAD_<op>.
   //
@@ -318,6 +327,7 @@ private:
   SDValue lowerSTACKSAVE(SDValue Op, SelectionDAG &DAG) const;
   SDValue lowerSTACKRESTORE(SDValue Op, SelectionDAG &DAG) const;
   SDValue lowerPREFETCH(SDValue Op, SelectionDAG &DAG) const;
+  SDValue lowerINTRINSIC_W_CHAIN(SDValue Op, SelectionDAG &DAG) const;
 
   // If the last instruction before MBBI in MBB was some form of COMPARE,
   // try to replace it with a COMPARE AND BRANCH just before MBBI.
@@ -355,6 +365,10 @@ private:
   MachineBasicBlock *emitStringWrapper(MachineInstr *MI,
                                        MachineBasicBlock *BB,
                                        unsigned Opcode) const;
+  MachineBasicBlock *emitTransactionBegin(MachineInstr *MI,
+                                          MachineBasicBlock *MBB,
+                                          unsigned Opcode,
+                                          bool NoFloat) const;
 };
 } // end namespace llvm
 
Index: llvm-head/lib/Target/SystemZ/SystemZISelLowering.cpp
===================================================================
--- llvm-head.orig/lib/Target/SystemZ/SystemZISelLowering.cpp
+++ llvm-head/lib/Target/SystemZ/SystemZISelLowering.cpp
@@ -20,6 +20,7 @@
 #include "llvm/CodeGen/MachineInstrBuilder.h"
 #include "llvm/CodeGen/MachineRegisterInfo.h"
 #include "llvm/CodeGen/TargetLoweringObjectFileImpl.h"
+#include "llvm/IR/Intrinsics.h"
 #include <cctype>
 
 using namespace llvm;
@@ -304,6 +305,9 @@ SystemZTargetLowering::SystemZTargetLowe
   // Codes for which we want to perform some z-specific combinations.
   setTargetDAGCombine(ISD::SIGN_EXTEND);
 
+  // Handle intrinsics.
+  setOperationAction(ISD::INTRINSIC_W_CHAIN, MVT::Other, Custom);
+
   // We want to use MVC in preference to even a single load/store pair.
   MaxStoresPerMemcpy = 0;
   MaxStoresPerMemcpyOptSize = 0;
@@ -1031,6 +1035,53 @@ prepareVolatileOrAtomicLoad(SDValue Chai
   return DAG.getNode(SystemZISD::SERIALIZE, DL, MVT::Other, Chain);
 }
 
+// Return true if Op is an intrinsic node with chain that returns the CC value
+// as its only (other) argument.  Provide the associated SystemZISD opcode and
+// the mask of valid CC values if so.
+static bool isIntrinsicWithCCAndChain(SDValue Op, unsigned &Opcode,
+                                      unsigned &CCValid) {
+  unsigned Id = cast<ConstantSDNode>(Op.getOperand(1))->getZExtValue();
+  switch (Id) {
+  case Intrinsic::s390_tbegin:
+    Opcode = SystemZISD::TBEGIN;
+    CCValid = SystemZ::CCMASK_TBEGIN;
+    return true;
+
+  case Intrinsic::s390_tbegin_nofloat:
+    Opcode = SystemZISD::TBEGIN_NOFLOAT;
+    CCValid = SystemZ::CCMASK_TBEGIN;
+    return true;
+
+  case Intrinsic::s390_tend:
+    Opcode = SystemZISD::TEND;
+    CCValid = SystemZ::CCMASK_TEND;
+    return true;
+
+  default:
+    return false;
+  }
+}
+
+// Emit an intrinsic with chain with a glued value instead of its CC result.
+static SDValue emitIntrinsicWithChainAndGlue(SelectionDAG &DAG, SDValue Op,
+                                             unsigned Opcode) {
+  // Copy all operands except the intrinsic ID.
+  unsigned NumOps = Op.getNumOperands();
+  SmallVector<SDValue, 6> Ops;
+  Ops.reserve(NumOps - 1);
+  Ops.push_back(Op.getOperand(0));
+  for (unsigned I = 2; I < NumOps; ++I)
+    Ops.push_back(Op.getOperand(I));
+
+  assert(Op->getNumValues() == 2 && "Expected only CC result and chain");
+  SDVTList RawVTs = DAG.getVTList(MVT::Other, MVT::Glue);
+  SDValue Intr = DAG.getNode(Opcode, SDLoc(Op), RawVTs, Ops);
+  SDValue OldChain = SDValue(Op.getNode(), 1);
+  SDValue NewChain = SDValue(Intr.getNode(), 0);
+  DAG.ReplaceAllUsesOfValueWith(OldChain, NewChain);
+  return Intr;
+}
+
 // CC is a comparison that will be implemented using an integer or
 // floating-point comparison.  Return the condition code mask for
 // a branch on true.  In the integer case, CCMASK_CMP_UO is set for
@@ -1588,9 +1639,53 @@ static void adjustForTestUnderMask(Selec
   C.CCMask = NewCCMask;
 }
 
+// Return a Comparison that tests the condition-code result of intrinsic
+// node Call against constant integer CC using comparison code Cond.
+// Opcode is the opcode of the SystemZISD operation for the intrinsic
+// and CCValid is the set of possible condition-code results.
+static Comparison getIntrinsicCmp(SelectionDAG &DAG, unsigned Opcode,
+                                  SDValue Call, unsigned CCValid, uint64_t CC,
+                                  ISD::CondCode Cond) {
+  Comparison C(Call, SDValue());
+  C.Opcode = Opcode;
+  C.CCValid = CCValid;
+  if (Cond == ISD::SETEQ)
+    // bit 3 for CC==0, bit 0 for CC==3, always false for CC>3.
+    C.CCMask = CC < 4 ? 1 << (3 - CC) : 0;
+  else if (Cond == ISD::SETNE)
+    // ...and the inverse of that.
+    C.CCMask = CC < 4 ? ~(1 << (3 - CC)) : -1;
+  else if (Cond == ISD::SETLT || Cond == ISD::SETULT)
+    // bits above bit 3 for CC==0 (always false), bits above bit 0 for CC==3,
+    // always true for CC>3.
+    C.CCMask = CC < 4 ? -1 << (4 - CC) : -1;
+  else if (Cond == ISD::SETGE || Cond == ISD::SETUGE)
+    // ...and the inverse of that.
+    C.CCMask = CC < 4 ? ~(-1 << (4 - CC)) : 0;
+  else if (Cond == ISD::SETLE || Cond == ISD::SETULE)
+    // bit 3 and above for CC==0, bit 0 and above for CC==3 (always true),
+    // always true for CC>3.
+    C.CCMask = CC < 4 ? -1 << (3 - CC) : -1;
+  else if (Cond == ISD::SETGT || Cond == ISD::SETUGT)
+    // ...and the inverse of that.
+    C.CCMask = CC < 4 ? ~(-1 << (3 - CC)) : 0;
+  else
+    llvm_unreachable("Unexpected integer comparison type");
+  C.CCMask &= CCValid;
+  return C;
+}
+
 // Decide how to implement a comparison of type Cond between CmpOp0 with CmpOp1.
 static Comparison getCmp(SelectionDAG &DAG, SDValue CmpOp0, SDValue CmpOp1,
                          ISD::CondCode Cond) {
+  if (CmpOp1.getOpcode() == ISD::Constant) {
+    uint64_t Constant = cast<ConstantSDNode>(CmpOp1)->getZExtValue();
+    unsigned Opcode, CCValid;
+    if (CmpOp0.getOpcode() == ISD::INTRINSIC_W_CHAIN &&
+        CmpOp0.getResNo() == 0 && CmpOp0->hasNUsesOfValue(1, 0) &&
+        isIntrinsicWithCCAndChain(CmpOp0, Opcode, CCValid))
+      return getIntrinsicCmp(DAG, Opcode, CmpOp0, CCValid, Constant, Cond);
+  }
   Comparison C(CmpOp0, CmpOp1);
   C.CCMask = CCMaskForCondCode(Cond);
   if (C.Op0.getValueType().isFloatingPoint()) {
@@ -1632,6 +1727,17 @@ static Comparison getCmp(SelectionDAG &D
 
 // Emit the comparison instruction described by C.
 static SDValue emitCmp(SelectionDAG &DAG, SDLoc DL, Comparison &C) {
+  if (!C.Op1.getNode()) {
+    SDValue Op;
+    switch (C.Op0.getOpcode()) {
+    case ISD::INTRINSIC_W_CHAIN:
+      Op = emitIntrinsicWithChainAndGlue(DAG, C.Op0, C.Opcode);
+      break;
+    default:
+      llvm_unreachable("Invalid comparison operands");
+    }
+    return SDValue(Op.getNode(), Op->getNumValues() - 1);
+  }
   if (C.Opcode == SystemZISD::ICMP)
     return DAG.getNode(SystemZISD::ICMP, DL, MVT::Glue, C.Op0, C.Op1,
                        DAG.getConstant(C.ICmpType, MVT::i32));
@@ -1713,7 +1819,6 @@ SDValue SystemZTargetLowering::lowerSETC
 }
 
 SDValue SystemZTargetLowering::lowerBR_CC(SDValue Op, SelectionDAG &DAG) const {
-  SDValue Chain    = Op.getOperand(0);
   ISD::CondCode CC = cast<CondCodeSDNode>(Op.getOperand(1))->get();
   SDValue CmpOp0   = Op.getOperand(2);
   SDValue CmpOp1   = Op.getOperand(3);
@@ -1723,7 +1828,7 @@ SDValue SystemZTargetLowering::lowerBR_C
   Comparison C(getCmp(DAG, CmpOp0, CmpOp1, CC));
   SDValue Glue = emitCmp(DAG, DL, C);
   return DAG.getNode(SystemZISD::BR_CCMASK, DL, Op.getValueType(),
-                     Chain, DAG.getConstant(C.CCValid, MVT::i32),
+                     Op.getOperand(0), DAG.getConstant(C.CCValid, MVT::i32),
                      DAG.getConstant(C.CCMask, MVT::i32), Dest, Glue);
 }
 
@@ -2561,6 +2666,30 @@ SDValue SystemZTargetLowering::lowerPREF
                                  Node->getMemoryVT(), Node->getMemOperand());
 }
 
+// Return an i32 that contains the value of CC immediately after After,
+// whose final operand must be MVT::Glue.
+static SDValue getCCResult(SelectionDAG &DAG, SDNode *After) {
+  SDValue Glue = SDValue(After, After->getNumValues() - 1);
+  SDValue IPM = DAG.getNode(SystemZISD::IPM, SDLoc(After), MVT::i32, Glue);
+  return DAG.getNode(ISD::SRL, SDLoc(After), MVT::i32, IPM,
+                     DAG.getConstant(SystemZ::IPM_CC, MVT::i32));
+}
+
+SDValue
+SystemZTargetLowering::lowerINTRINSIC_W_CHAIN(SDValue Op,
+                                              SelectionDAG &DAG) const {
+  unsigned Opcode, CCValid;
+  if (isIntrinsicWithCCAndChain(Op, Opcode, CCValid)) {
+    assert(Op->getNumValues() == 2 && "Expected only CC result and chain");
+    SDValue Glued = emitIntrinsicWithChainAndGlue(DAG, Op, Opcode);
+    SDValue CC = getCCResult(DAG, Glued.getNode());
+    DAG.ReplaceAllUsesOfValueWith(SDValue(Op.getNode(), 0), CC);
+    return SDValue();
+  }
+
+  return SDValue();
+}
+
 SDValue SystemZTargetLowering::LowerOperation(SDValue Op,
                                               SelectionDAG &DAG) const {
   switch (Op.getOpcode()) {
@@ -2634,6 +2763,8 @@ SDValue SystemZTargetLowering::LowerOper
     return lowerSTACKRESTORE(Op, DAG);
   case ISD::PREFETCH:
     return lowerPREFETCH(Op, DAG);
+  case ISD::INTRINSIC_W_CHAIN:
+    return lowerINTRINSIC_W_CHAIN(Op, DAG);
   default:
     llvm_unreachable("Unexpected node to lower");
   }
@@ -2674,6 +2805,9 @@ const char *SystemZTargetLowering::getTa
     OPCODE(SEARCH_STRING);
     OPCODE(IPM);
     OPCODE(SERIALIZE);
+    OPCODE(TBEGIN);
+    OPCODE(TBEGIN_NOFLOAT);
+    OPCODE(TEND);
     OPCODE(ATOMIC_SWAPW);
     OPCODE(ATOMIC_LOADW_ADD);
     OPCODE(ATOMIC_LOADW_SUB);
@@ -3501,6 +3635,50 @@ SystemZTargetLowering::emitStringWrapper
   return DoneMBB;
 }
 
+// Update TBEGIN instruction with final opcode and register clobbers.
+MachineBasicBlock *
+SystemZTargetLowering::emitTransactionBegin(MachineInstr *MI,
+                                            MachineBasicBlock *MBB,
+                                            unsigned Opcode,
+                                            bool NoFloat) const {
+  MachineFunction &MF = *MBB->getParent();
+  const TargetFrameLowering *TFI = Subtarget.getFrameLowering();
+  const SystemZInstrInfo *TII = Subtarget.getInstrInfo();
+
+  // Update opcode.
+  MI->setDesc(TII->get(Opcode));
+
+  // We cannot handle a TBEGIN that clobbers the stack or frame pointer.
+  // Make sure to add the corresponding GRSM bits if they are missing.
+  uint64_t Control = MI->getOperand(2).getImm();
+  static const unsigned GPRControlBit[16] = {
+    0x8000, 0x8000, 0x4000, 0x4000, 0x2000, 0x2000, 0x1000, 0x1000,
+    0x0800, 0x0800, 0x0400, 0x0400, 0x0200, 0x0200, 0x0100, 0x0100
+  };
+  Control |= GPRControlBit[15];
+  if (TFI->hasFP(MF))
+    Control |= GPRControlBit[11];
+  MI->getOperand(2).setImm(Control);
+
+  // Add GPR clobbers.
+  for (int I = 0; I < 16; I++) {
+    if ((Control & GPRControlBit[I]) == 0) {
+      unsigned Reg = SystemZMC::GR64Regs[I];
+      MI->addOperand(MachineOperand::CreateReg(Reg, true, true));
+    }
+  }
+
+  // Add FPR clobbers.
+  if (!NoFloat && (Control & 4) != 0) {
+    for (int I = 0; I < 16; I++) {
+      unsigned Reg = SystemZMC::FP64Regs[I];
+      MI->addOperand(MachineOperand::CreateReg(Reg, true, true));
+    }
+  }
+
+  return MBB;
+}
+
 MachineBasicBlock *SystemZTargetLowering::
 EmitInstrWithCustomInserter(MachineInstr *MI, MachineBasicBlock *MBB) const {
   switch (MI->getOpcode()) {
@@ -3742,6 +3920,12 @@ EmitInstrWithCustomInserter(MachineInstr
     return emitStringWrapper(MI, MBB, SystemZ::MVST);
   case SystemZ::SRSTLoop:
     return emitStringWrapper(MI, MBB, SystemZ::SRST);
+  case SystemZ::TBEGIN:
+    return emitTransactionBegin(MI, MBB, SystemZ::TBEGIN, false);
+  case SystemZ::TBEGIN_nofloat:
+    return emitTransactionBegin(MI, MBB, SystemZ::TBEGIN, true);
+  case SystemZ::TBEGINC:
+    return emitTransactionBegin(MI, MBB, SystemZ::TBEGINC, true);
   default:
     llvm_unreachable("Unexpected instr type to insert");
   }
Index: llvm-head/test/CodeGen/SystemZ/htm-intrinsics.ll
===================================================================
--- /dev/null
+++ llvm-head/test/CodeGen/SystemZ/htm-intrinsics.ll
@@ -0,0 +1,352 @@
+; Test transactional-execution intrinsics.
+;
+; RUN: llc < %s -mtriple=s390x-linux-gnu -mcpu=zEC12 | FileCheck %s
+
+declare i32 @llvm.s390.tbegin(i8 *, i32)
+declare i32 @llvm.s390.tbegin.nofloat(i8 *, i32)
+declare void @llvm.s390.tbeginc(i8 *, i32)
+declare i32 @llvm.s390.tend()
+declare void @llvm.s390.tabort(i64)
+declare void @llvm.s390.ntstg(i64, i64 *)
+declare i32 @llvm.s390.etnd()
+declare void @llvm.s390.ppa.txassist(i32)
+
+; TBEGIN.
+define void @test_tbegin() {
+; CHECK-LABEL: test_tbegin:
+; CHECK-NOT: stmg
+; CHECK: std %f8,
+; CHECK: std %f9,
+; CHECK: std %f10,
+; CHECK: std %f11,
+; CHECK: std %f12,
+; CHECK: std %f13,
+; CHECK: std %f14,
+; CHECK: std %f15,
+; CHECK: tbegin 0, 65292
+; CHECK: ld %f8,
+; CHECK: ld %f9,
+; CHECK: ld %f10,
+; CHECK: ld %f11,
+; CHECK: ld %f12,
+; CHECK: ld %f13,
+; CHECK: ld %f14,
+; CHECK: ld %f15,
+; CHECK: br %r14
+  call i32 @llvm.s390.tbegin(i8 *null, i32 65292)
+  ret void
+}
+
+; TBEGIN (nofloat).
+define void @test_tbegin_nofloat1() {
+; CHECK-LABEL: test_tbegin_nofloat1:
+; CHECK-NOT: stmg
+; CHECK-NOT: std
+; CHECK: tbegin 0, 65292
+; CHECK: br %r14
+  call i32 @llvm.s390.tbegin.nofloat(i8 *null, i32 65292)
+  ret void
+}
+
+; TBEGIN (nofloat) with integer CC return value.
+define i32 @test_tbegin_nofloat2() {
+; CHECK-LABEL: test_tbegin_nofloat2:
+; CHECK-NOT: stmg
+; CHECK-NOT: std
+; CHECK: tbegin 0, 65292
+; CHECK: ipm %r2
+; CHECK: srl %r2, 28
+; CHECK: br %r14
+  %res = call i32 @llvm.s390.tbegin.nofloat(i8 *null, i32 65292)
+  ret i32 %res
+}
+
+; TBEGIN (nofloat) with implicit CC check.
+define void @test_tbegin_nofloat3(i32 *%ptr) {
+; CHECK-LABEL: test_tbegin_nofloat3:
+; CHECK-NOT: stmg
+; CHECK-NOT: std
+; CHECK: tbegin 0, 65292
+; CHECK: jnh  {{\.L*}}
+; CHECK: mvhi 0(%r2), 0
+; CHECK: br %r14
+  %res = call i32 @llvm.s390.tbegin.nofloat(i8 *null, i32 65292)
+  %cmp = icmp eq i32 %res, 2
+  br i1 %cmp, label %if.then, label %if.end
+
+if.then:                                          ; preds = %entry
+  store i32 0, i32* %ptr, align 4
+  br label %if.end
+
+if.end:                                           ; preds = %if.then, %entry
+  ret void
+}
+
+; TBEGIN (nofloat) with dual CC use.
+define i32 @test_tbegin_nofloat4(i32 %pad, i32 *%ptr) {
+; CHECK-LABEL: test_tbegin_nofloat4:
+; CHECK-NOT: stmg
+; CHECK-NOT: std
+; CHECK: tbegin 0, 65292
+; CHECK: ipm %r2
+; CHECK: srl %r2, 28
+; CHECK: cijlh %r2, 2,  {{\.L*}}
+; CHECK: mvhi 0(%r3), 0
+; CHECK: br %r14
+  %res = call i32 @llvm.s390.tbegin.nofloat(i8 *null, i32 65292)
+  %cmp = icmp eq i32 %res, 2
+  br i1 %cmp, label %if.then, label %if.end
+
+if.then:                                          ; preds = %entry
+  store i32 0, i32* %ptr, align 4
+  br label %if.end
+
+if.end:                                           ; preds = %if.then, %entry
+  ret i32 %res
+}
+
+; TBEGIN (nofloat) with register.
+define void @test_tbegin_nofloat5(i8 *%ptr) {
+; CHECK-LABEL: test_tbegin_nofloat5:
+; CHECK-NOT: stmg
+; CHECK-NOT: std
+; CHECK: tbegin 0(%r2), 65292
+; CHECK: br %r14
+  call i32 @llvm.s390.tbegin.nofloat(i8 *%ptr, i32 65292)
+  ret void
+}
+
+; TBEGIN (nofloat) with GRSM 0x0f00.
+define void @test_tbegin_nofloat6() {
+; CHECK-LABEL: test_tbegin_nofloat6:
+; CHECK: stmg %r6, %r15,
+; CHECK-NOT: std
+; CHECK: tbegin 0, 3840
+; CHECK: br %r14
+  call i32 @llvm.s390.tbegin.nofloat(i8 *null, i32 3840)
+  ret void
+}
+
+; TBEGIN (nofloat) with GRSM 0xf100.
+define void @test_tbegin_nofloat7() {
+; CHECK-LABEL: test_tbegin_nofloat7:
+; CHECK: stmg %r8, %r15,
+; CHECK-NOT: std
+; CHECK: tbegin 0, 61696
+; CHECK: br %r14
+  call i32 @llvm.s390.tbegin.nofloat(i8 *null, i32 61696)
+  ret void
+}
+
+; TBEGIN (nofloat) with GRSM 0xfe00 -- stack pointer added automatically.
+define void @test_tbegin_nofloat8() {
+; CHECK-LABEL: test_tbegin_nofloat8:
+; CHECK-NOT: stmg
+; CHECK-NOT: std
+; CHECK: tbegin 0, 65280
+; CHECK: br %r14
+  call i32 @llvm.s390.tbegin.nofloat(i8 *null, i32 65024)
+  ret void
+}
+
+; TBEGIN (nofloat) with GRSM 0xfb00 -- no frame pointer needed.
+define void @test_tbegin_nofloat9() {
+; CHECK-LABEL: test_tbegin_nofloat9:
+; CHECK: stmg %r10, %r15,
+; CHECK-NOT: std
+; CHECK: tbegin 0, 64256
+; CHECK: br %r14
+  call i32 @llvm.s390.tbegin.nofloat(i8 *null, i32 64256)
+  ret void
+}
+
+; TBEGIN (nofloat) with GRSM 0xfb00 -- frame pointer added automatically.
+define void @test_tbegin_nofloat10(i64 %n) {
+; CHECK-LABEL: test_tbegin_nofloat10:
+; CHECK: stmg %r11, %r15,
+; CHECK-NOT: std
+; CHECK: tbegin 0, 65280
+; CHECK: br %r14
+  %buf = alloca i8, i64 %n
+  call i32 @llvm.s390.tbegin.nofloat(i8 *null, i32 64256)
+  ret void
+}
+
+; TBEGINC.
+define void @test_tbeginc() {
+; CHECK-LABEL: test_tbeginc:
+; CHECK-NOT: stmg
+; CHECK-NOT: std
+; CHECK: tbeginc 0, 65288
+; CHECK: br %r14
+  call void @llvm.s390.tbeginc(i8 *null, i32 65288)
+  ret void
+}
+
+; TEND with integer CC return value.
+define i32 @test_tend1() {
+; CHECK-LABEL: test_tend1:
+; CHECK: tend
+; CHECK: ipm %r2
+; CHECK: srl %r2, 28
+; CHECK: br %r14
+  %res = call i32 @llvm.s390.tend()
+  ret i32 %res
+}
+
+; TEND with implicit CC check.
+define void @test_tend3(i32 *%ptr) {
+; CHECK-LABEL: test_tend3:
+; CHECK: tend
+; CHECK: je  {{\.L*}}
+; CHECK: mvhi 0(%r2), 0
+; CHECK: br %r14
+  %res = call i32 @llvm.s390.tend()
+  %cmp = icmp eq i32 %res, 2
+  br i1 %cmp, label %if.then, label %if.end
+
+if.then:                                          ; preds = %entry
+  store i32 0, i32* %ptr, align 4
+  br label %if.end
+
+if.end:                                           ; preds = %if.then, %entry
+  ret void
+}
+
+; TEND with dual CC use.
+define i32 @test_tend2(i32 %pad, i32 *%ptr) {
+; CHECK-LABEL: test_tend2:
+; CHECK: tend
+; CHECK: ipm %r2
+; CHECK: srl %r2, 28
+; CHECK: cijlh %r2, 2,  {{\.L*}}
+; CHECK: mvhi 0(%r3), 0
+; CHECK: br %r14
+  %res = call i32 @llvm.s390.tend()
+  %cmp = icmp eq i32 %res, 2
+  br i1 %cmp, label %if.then, label %if.end
+
+if.then:                                          ; preds = %entry
+  store i32 0, i32* %ptr, align 4
+  br label %if.end
+
+if.end:                                           ; preds = %if.then, %entry
+  ret i32 %res
+}
+
+; TABORT with register only.
+define void @test_tabort1(i64 %val) {
+; CHECK-LABEL: test_tabort1:
+; CHECK: tabort 0(%r2)
+; CHECK: br %r14
+  call void @llvm.s390.tabort(i64 %val)
+  ret void
+}
+
+; TABORT with immediate only.
+define void @test_tabort2(i64 %val) {
+; CHECK-LABEL: test_tabort2:
+; CHECK: tabort 1234
+; CHECK: br %r14
+  call void @llvm.s390.tabort(i64 1234)
+  ret void
+}
+
+; TABORT with register + immediate.
+define void @test_tabort3(i64 %val) {
+; CHECK-LABEL: test_tabort3:
+; CHECK: tabort 1234(%r2)
+; CHECK: br %r14
+  %sum = add i64 %val, 1234
+  call void @llvm.s390.tabort(i64 %sum)
+  ret void
+}
+
+; TABORT with out-of-range immediate.
+define void @test_tabort4(i64 %val) {
+; CHECK-LABEL: test_tabort4:
+; CHECK: tabort 0({{%r[1-5]}})
+; CHECK: br %r14
+  call void @llvm.s390.tabort(i64 4096)
+  ret void
+}
+
+; NTSTG with base pointer only.
+define void @test_ntstg1(i64 *%ptr, i64 %val) {
+; CHECK-LABEL: test_ntstg1:
+; CHECK: ntstg %r3, 0(%r2)
+; CHECK: br %r14
+  call void @llvm.s390.ntstg(i64 %val, i64 *%ptr)
+  ret void
+}
+
+; NTSTG with base and index.
+; Check that VSTL doesn't allow an index.
+define void @test_ntstg2(i64 *%base, i64 %index, i64 %val) {
+; CHECK-LABEL: test_ntstg2:
+; CHECK: sllg [[REG:%r[1-5]]], %r3, 3
+; CHECK: ntstg %r4, 0([[REG]],%r2)
+; CHECK: br %r14
+  %ptr = getelementptr i64, i64 *%base, i64 %index
+  call void @llvm.s390.ntstg(i64 %val, i64 *%ptr)
+  ret void
+}
+
+; NTSTG with the highest in-range displacement.
+define void @test_ntstg3(i64 *%base, i64 %val) {
+; CHECK-LABEL: test_ntstg3:
+; CHECK: ntstg %r3, 524280(%r2)
+; CHECK: br %r14
+  %ptr = getelementptr i64, i64 *%base, i64 65535
+  call void @llvm.s390.ntstg(i64 %val, i64 *%ptr)
+  ret void
+}
+
+; NTSTG with an out-of-range positive displacement.
+define void @test_ntstg4(i64 *%base, i64 %val) {
+; CHECK-LABEL: test_ntstg4:
+; CHECK: ntstg %r3, 0({{%r[1-5]}})
+; CHECK: br %r14
+  %ptr = getelementptr i64, i64 *%base, i64 65536
+  call void @llvm.s390.ntstg(i64 %val, i64 *%ptr)
+  ret void
+}
+
+; NTSTG with the lowest in-range displacement.
+define void @test_ntstg5(i64 *%base, i64 %val) {
+; CHECK-LABEL: test_ntstg5:
+; CHECK: ntstg %r3, -524288(%r2)
+; CHECK: br %r14
+  %ptr = getelementptr i64, i64 *%base, i64 -65536
+  call void @llvm.s390.ntstg(i64 %val, i64 *%ptr)
+  ret void
+}
+
+; NTSTG with an out-of-range negative displacement.
+define void @test_ntstg6(i64 *%base, i64 %val) {
+; CHECK-LABEL: test_ntstg6:
+; CHECK: ntstg %r3, 0({{%r[1-5]}})
+; CHECK: br %r14
+  %ptr = getelementptr i64, i64 *%base, i64 -65537
+  call void @llvm.s390.ntstg(i64 %val, i64 *%ptr)
+  ret void
+}
+
+; ETND.
+define i32 @test_etnd() {
+; CHECK-LABEL: test_etnd:
+; CHECK: etnd %r2
+; CHECK: br %r14
+  %res = call i32 @llvm.s390.etnd()
+  ret i32 %res
+}
+
+; PPA (Transaction-Abort Assist)
+define void @test_ppa_txassist(i32 %val) {
+; CHECK-LABEL: test_ppa_txassist:
+; CHECK: ppa %r2, 0, 1
+; CHECK: br %r14
+  call void @llvm.s390.ppa.txassist(i32 %val)
+  ret void
+}
+
Index: llvm-head/test/MC/SystemZ/insn-bad-zEC12.s
===================================================================
--- llvm-head.orig/test/MC/SystemZ/insn-bad-zEC12.s
+++ llvm-head/test/MC/SystemZ/insn-bad-zEC12.s
@@ -3,6 +3,22 @@
 # RUN: FileCheck < %t %s
 
 #CHECK: error: invalid operand
+#CHECK: ntstg	%r0, -524289
+#CHECK: error: invalid operand
+#CHECK: ntstg	%r0, 524288
+
+	ntstg	%r0, -524289
+	ntstg	%r0, 524288
+
+#CHECK: error: invalid operand
+#CHECK: ppa	%r0, %r0, -1
+#CHECK: error: invalid operand
+#CHECK: ppa	%r0, %r0, 16
+
+	ppa	%r0, %r0, -1
+	ppa	%r0, %r0, 16
+
+#CHECK: error: invalid operand
 #CHECK: risbgn	%r0,%r0,0,0,-1
 #CHECK: error: invalid operand
 #CHECK: risbgn	%r0,%r0,0,0,64
@@ -22,3 +38,47 @@
 	risbgn	%r0,%r0,-1,0,0
 	risbgn	%r0,%r0,256,0,0
 
+#CHECK: error: invalid operand
+#CHECK: tabort	-1
+#CHECK: error: invalid operand
+#CHECK: tabort	4096
+#CHECK: error: invalid use of indexed addressing
+#CHECK: tabort	0(%r1,%r2)
+
+	tabort	-1
+	tabort	4096
+	tabort	0(%r1,%r2)
+
+#CHECK: error: invalid operand
+#CHECK: tbegin	-1, 0
+#CHECK: error: invalid operand
+#CHECK: tbegin	4096, 0
+#CHECK: error: invalid use of indexed addressing
+#CHECK: tbegin	0(%r1,%r2), 0
+#CHECK: error: invalid operand
+#CHECK: tbegin	0, -1
+#CHECK: error: invalid operand
+#CHECK: tbegin	0, 65536
+
+	tbegin	-1, 0
+	tbegin	4096, 0
+	tbegin	0(%r1,%r2), 0
+	tbegin	0, -1
+	tbegin	0, 65536
+
+#CHECK: error: invalid operand
+#CHECK: tbeginc	-1, 0
+#CHECK: error: invalid operand
+#CHECK: tbeginc	4096, 0
+#CHECK: error: invalid use of indexed addressing
+#CHECK: tbeginc	0(%r1,%r2), 0
+#CHECK: error: invalid operand
+#CHECK: tbeginc	0, -1
+#CHECK: error: invalid operand
+#CHECK: tbeginc	0, 65536
+
+	tbeginc	-1, 0
+	tbeginc	4096, 0
+	tbeginc	0(%r1,%r2), 0
+	tbeginc	0, -1
+	tbeginc	0, 65536
Index: llvm-head/test/MC/SystemZ/insn-good-zEC12.s
===================================================================
--- llvm-head.orig/test/MC/SystemZ/insn-good-zEC12.s
+++ llvm-head/test/MC/SystemZ/insn-good-zEC12.s
@@ -1,6 +1,48 @@
 # For zEC12 and above.
 # RUN: llvm-mc -triple s390x-linux-gnu -mcpu=zEC12 -show-encoding %s | FileCheck %s
 
+#CHECK: etnd	%r0                     # encoding: [0xb2,0xec,0x00,0x00]
+#CHECK: etnd	%r15                    # encoding: [0xb2,0xec,0x00,0xf0]
+#CHECK: etnd	%r7                     # encoding: [0xb2,0xec,0x00,0x70]
+
+	etnd	%r0
+	etnd	%r15
+	etnd	%r7
+
+#CHECK: ntstg	%r0, -524288            # encoding: [0xe3,0x00,0x00,0x00,0x80,0x25]
+#CHECK: ntstg	%r0, -1                 # encoding: [0xe3,0x00,0x0f,0xff,0xff,0x25]
+#CHECK: ntstg	%r0, 0                  # encoding: [0xe3,0x00,0x00,0x00,0x00,0x25]
+#CHECK: ntstg	%r0, 1                  # encoding: [0xe3,0x00,0x00,0x01,0x00,0x25]
+#CHECK: ntstg	%r0, 524287             # encoding: [0xe3,0x00,0x0f,0xff,0x7f,0x25]
+#CHECK: ntstg	%r0, 0(%r1)             # encoding: [0xe3,0x00,0x10,0x00,0x00,0x25]
+#CHECK: ntstg	%r0, 0(%r15)            # encoding: [0xe3,0x00,0xf0,0x00,0x00,0x25]
+#CHECK: ntstg	%r0, 524287(%r1,%r15)   # encoding: [0xe3,0x01,0xff,0xff,0x7f,0x25]
+#CHECK: ntstg	%r0, 524287(%r15,%r1)   # encoding: [0xe3,0x0f,0x1f,0xff,0x7f,0x25]
+#CHECK: ntstg	%r15, 0                 # encoding: [0xe3,0xf0,0x00,0x00,0x00,0x25]
+
+	ntstg	%r0, -524288
+	ntstg	%r0, -1
+	ntstg	%r0, 0
+	ntstg	%r0, 1
+	ntstg	%r0, 524287
+	ntstg	%r0, 0(%r1)
+	ntstg	%r0, 0(%r15)
+	ntstg	%r0, 524287(%r1,%r15)
+	ntstg	%r0, 524287(%r15,%r1)
+	ntstg	%r15, 0
+
+#CHECK: ppa	%r0, %r0, 0             # encoding: [0xb2,0xe8,0x00,0x00]
+#CHECK: ppa	%r0, %r0, 15            # encoding: [0xb2,0xe8,0xf0,0x00]
+#CHECK: ppa	%r0, %r15, 0            # encoding: [0xb2,0xe8,0x00,0x0f]
+#CHECK: ppa	%r4, %r6, 7             # encoding: [0xb2,0xe8,0x70,0x46]
+#CHECK: ppa	%r15, %r0, 0            # encoding: [0xb2,0xe8,0x00,0xf0]
+
+	ppa	%r0, %r0, 0
+	ppa	%r0, %r0, 15
+	ppa	%r0, %r15, 0
+	ppa	%r4, %r6, 7
+	ppa	%r15, %r0, 0
+
 #CHECK: risbgn	%r0, %r0, 0, 0, 0       # encoding: [0xec,0x00,0x00,0x00,0x00,0x59]
 #CHECK: risbgn	%r0, %r0, 0, 0, 63      # encoding: [0xec,0x00,0x00,0x00,0x3f,0x59]
 #CHECK: risbgn	%r0, %r0, 0, 255, 0     # encoding: [0xec,0x00,0x00,0xff,0x00,0x59]
@@ -17,3 +59,68 @@
 	risbgn	%r15,%r0,0,0,0
 	risbgn	%r4,%r5,6,7,8
 
+#CHECK: tabort	0                       # encoding: [0xb2,0xfc,0x00,0x00]
+#CHECK: tabort	0(%r1)                  # encoding: [0xb2,0xfc,0x10,0x00]
+#CHECK: tabort	0(%r15)                 # encoding: [0xb2,0xfc,0xf0,0x00]
+#CHECK: tabort	4095                    # encoding: [0xb2,0xfc,0x0f,0xff]
+#CHECK: tabort	4095(%r1)               # encoding: [0xb2,0xfc,0x1f,0xff]
+#CHECK: tabort	4095(%r15)              # encoding: [0xb2,0xfc,0xff,0xff]
+
+	tabort	0
+	tabort	0(%r1)
+	tabort	0(%r15)
+	tabort	4095
+	tabort	4095(%r1)
+	tabort	4095(%r15)
+
+#CHECK: tbegin	0, 0                    # encoding: [0xe5,0x60,0x00,0x00,0x00,0x00]
+#CHECK: tbegin	4095, 0                 # encoding: [0xe5,0x60,0x0f,0xff,0x00,0x00]
+#CHECK: tbegin	0, 0                    # encoding: [0xe5,0x60,0x00,0x00,0x00,0x00]
+#CHECK: tbegin	0, 1                    # encoding: [0xe5,0x60,0x00,0x00,0x00,0x01]
+#CHECK: tbegin	0, 32767                # encoding: [0xe5,0x60,0x00,0x00,0x7f,0xff]
+#CHECK: tbegin	0, 32768                # encoding: [0xe5,0x60,0x00,0x00,0x80,0x00]
+#CHECK: tbegin	0, 65535                # encoding: [0xe5,0x60,0x00,0x00,0xff,0xff]
+#CHECK: tbegin	0(%r1), 42              # encoding: [0xe5,0x60,0x10,0x00,0x00,0x2a]
+#CHECK: tbegin	0(%r15), 42             # encoding: [0xe5,0x60,0xf0,0x00,0x00,0x2a]
+#CHECK: tbegin	4095(%r1), 42           # encoding: [0xe5,0x60,0x1f,0xff,0x00,0x2a]
+#CHECK: tbegin	4095(%r15), 42          # encoding: [0xe5,0x60,0xff,0xff,0x00,0x2a]
+
+	tbegin	0, 0
+	tbegin	4095, 0
+	tbegin	0, 0
+	tbegin	0, 1
+	tbegin	0, 32767
+	tbegin	0, 32768
+	tbegin	0, 65535
+	tbegin	0(%r1), 42
+	tbegin	0(%r15), 42
+	tbegin	4095(%r1), 42
+	tbegin	4095(%r15), 42
+
+#CHECK: tbeginc	0, 0                    # encoding: [0xe5,0x61,0x00,0x00,0x00,0x00]
+#CHECK: tbeginc	4095, 0                 # encoding: [0xe5,0x61,0x0f,0xff,0x00,0x00]
+#CHECK: tbeginc	0, 0                    # encoding: [0xe5,0x61,0x00,0x00,0x00,0x00]
+#CHECK: tbeginc	0, 1                    # encoding: [0xe5,0x61,0x00,0x00,0x00,0x01]
+#CHECK: tbeginc	0, 32767                # encoding: [0xe5,0x61,0x00,0x00,0x7f,0xff]
+#CHECK: tbeginc	0, 32768                # encoding: [0xe5,0x61,0x00,0x00,0x80,0x00]
+#CHECK: tbeginc	0, 65535                # encoding: [0xe5,0x61,0x00,0x00,0xff,0xff]
+#CHECK: tbeginc	0(%r1), 42              # encoding: [0xe5,0x61,0x10,0x00,0x00,0x2a]
+#CHECK: tbeginc	0(%r15), 42             # encoding: [0xe5,0x61,0xf0,0x00,0x00,0x2a]
+#CHECK: tbeginc	4095(%r1), 42           # encoding: [0xe5,0x61,0x1f,0xff,0x00,0x2a]
+#CHECK: tbeginc	4095(%r15), 42          # encoding: [0xe5,0x61,0xff,0xff,0x00,0x2a]
+
+	tbeginc	0, 0
+	tbeginc	4095, 0
+	tbeginc	0, 0
+	tbeginc	0, 1
+	tbeginc	0, 32767
+	tbeginc	0, 32768
+	tbeginc	0, 65535
+	tbeginc	0(%r1), 42
+	tbeginc	0(%r15), 42
+	tbeginc	4095(%r1), 42
+	tbeginc	4095(%r15), 42
+
+#CHECK: tend                            # encoding: [0xb2,0xf8,0x00,0x00]
+
+	tend
Index: llvm-head/test/MC/SystemZ/insn-bad-z196.s
===================================================================
--- llvm-head.orig/test/MC/SystemZ/insn-bad-z196.s
+++ llvm-head/test/MC/SystemZ/insn-bad-z196.s
@@ -244,6 +244,11 @@
 	cxlgbr	%f0, 16, %r0, 0
 	cxlgbr	%f2, 0, %r0, 0
 
+#CHECK: error: {{(instruction requires: transactional-execution)?}}
+#CHECK: etnd	%r7
+
+	etnd	%r7
+
 #CHECK: error: invalid operand
 #CHECK: fidbra	%f0, 0, %f0, -1
 #CHECK: error: invalid operand
@@ -546,6 +551,16 @@
 	locr	%r0,%r0,-1
 	locr	%r0,%r0,16
 
+#CHECK: error: {{(instruction requires: transactional-execution)?}}
+#CHECK: ntstg	%r0, 524287(%r1,%r15)
+
+	ntstg	%r0, 524287(%r1,%r15)
+
+#CHECK: error: {{(instruction requires: processor-assist)?}}
+#CHECK: ppa	%r4, %r6, 7
+
+	ppa	%r4, %r6, 7
+
 #CHECK: error: {{(instruction requires: miscellaneous-extensions)?}}
 #CHECK: risbgn	%r1, %r2, 0, 0, 0
 
@@ -690,3 +705,24 @@
 	stocg	%r0,-524289,1
 	stocg	%r0,524288,1
 	stocg	%r0,0(%r1,%r2),1
+
+#CHECK: error: {{(instruction requires: transactional-execution)?}}
+#CHECK: tabort	4095(%r1)
+
+	tabort	4095(%r1)
+
+#CHECK: error: {{(instruction requires: transactional-execution)?}}
+#CHECK: tbegin	4095(%r1), 42
+
+	tbegin	4095(%r1), 42
+
+#CHECK: error: {{(instruction requires: transactional-execution)?}}
+#CHECK: tbeginc	4095(%r1), 42
+
+	tbeginc	4095(%r1), 42
+
+#CHECK: error: {{(instruction requires: transactional-execution)?}}
+#CHECK: tend
+
+	tend
+
Index: llvm-head/test/MC/Disassembler/SystemZ/insns.txt
===================================================================
--- llvm-head.orig/test/MC/Disassembler/SystemZ/insns.txt
+++ llvm-head/test/MC/Disassembler/SystemZ/insns.txt
@@ -2503,6 +2503,15 @@
 # CHECK: ear %r15, %a15
 0xb2 0x4f 0x00 0xff
 
+# CHECK: etnd %r0
+0xb2 0xec 0x00 0x00
+
+# CHECK: etnd %r15
+0xb2 0xec 0x00 0xf0
+
+# CHECK: etnd %r7
+0xb2 0xec 0x00 0x70
+
 # CHECK: fidbr %f0, 0, %f0
 0xb3 0x5f 0x00 0x00
 
@@ -6034,6 +6043,36 @@
 # CHECK: ny %r15, 0
 0xe3 0xf0 0x00 0x00 0x00 0x54
 
+# CHECK: ntstg %r0, -524288
+0xe3 0x00 0x00 0x00 0x80 0x25
+
+# CHECK: ntstg %r0, -1
+0xe3 0x00 0x0f 0xff 0xff 0x25
+
+# CHECK: ntstg %r0, 0
+0xe3 0x00 0x00 0x00 0x00 0x25
+
+# CHECK: ntstg %r0, 1
+0xe3 0x00 0x00 0x01 0x00 0x25
+
+# CHECK: ntstg %r0, 524287
+0xe3 0x00 0x0f 0xff 0x7f 0x25
+
+# CHECK: ntstg %r0, 0(%r1)
+0xe3 0x00 0x10 0x00 0x00 0x25
+
+# CHECK: ntstg %r0, 0(%r15)
+0xe3 0x00 0xf0 0x00 0x00 0x25
+
+# CHECK: ntstg %r0, 524287(%r1,%r15)
+0xe3 0x01 0xff 0xff 0x7f 0x25
+
+# CHECK: ntstg %r0, 524287(%r15,%r1)
+0xe3 0x0f 0x1f 0xff 0x7f 0x25
+
+# CHECK: ntstg %r15, 0
+0xe3 0xf0 0x00 0x00 0x00 0x25
+
 # CHECK: oc 0(1), 0
 0xd6 0x00 0x00 0x00 0x00 0x00
 
@@ -6346,6 +6385,21 @@
 # CHECK: popcnt %r7, %r8
 0xb9 0xe1 0x00 0x78
 
+# CHECK: ppa %r0, %r0, 0
+0xb2 0xe8 0x00 0x00
+
+# CHECK: ppa %r0, %r0, 15
+0xb2 0xe8 0xf0 0x00
+
+# CHECK: ppa %r0, %r15, 0
+0xb2 0xe8 0x00 0x0f
+
+# CHECK: ppa %r4, %r6, 7
+0xb2 0xe8 0x70 0x46
+
+# CHECK: ppa %r15, %r0, 0
+0xb2 0xe8 0x00 0xf0
+
 # CHECK: risbg %r0, %r0, 0, 0, 0
 0xec 0x00 0x00 0x00 0x00 0x55
 
@@ -8062,6 +8116,93 @@
 # CHECK: sy %r15, 0
 0xe3 0xf0 0x00 0x00 0x00 0x5b
 
+# CHECK: tabort 0
+0xb2 0xfc 0x00 0x00
+
+# CHECK: tabort 0(%r1)
+0xb2 0xfc 0x10 0x00
+
+# CHECK: tabort 0(%r15)
+0xb2 0xfc 0xf0 0x00
+
+# CHECK: tabort 4095
+0xb2 0xfc 0x0f 0xff
+
+# CHECK: tabort 4095(%r1)
+0xb2 0xfc 0x1f 0xff
+
+# CHECK: tabort 4095(%r15)
+0xb2 0xfc 0xff 0xff
+
+# CHECK: tbegin 0, 0
+0xe5 0x60 0x00 0x00 0x00 0x00
+
+# CHECK: tbegin 4095, 0
+0xe5 0x60 0x0f 0xff 0x00 0x00
+
+# CHECK: tbegin 0, 0
+0xe5 0x60 0x00 0x00 0x00 0x00
+
+# CHECK: tbegin 0, 1
+0xe5 0x60 0x00 0x00 0x00 0x01
+
+# CHECK: tbegin 0, 32767
+0xe5 0x60 0x00 0x00 0x7f 0xff
+
+# CHECK: tbegin 0, 32768
+0xe5 0x60 0x00 0x00 0x80 0x00
+
+# CHECK: tbegin 0, 65535
+0xe5 0x60 0x00 0x00 0xff 0xff
+
+# CHECK: tbegin 0(%r1), 42
+0xe5 0x60 0x10 0x00 0x00 0x2a
+
+# CHECK: tbegin 0(%r15), 42
+0xe5 0x60 0xf0 0x00 0x00 0x2a
+
+# CHECK: tbegin 4095(%r1), 42
+0xe5 0x60 0x1f 0xff 0x00 0x2a
+
+# CHECK: tbegin 4095(%r15), 42
+0xe5 0x60 0xff 0xff 0x00 0x2a
+
+# CHECK: tbeginc 0, 0
+0xe5 0x61 0x00 0x00 0x00 0x00
+
+# CHECK: tbeginc 4095, 0
+0xe5 0x61 0x0f 0xff 0x00 0x00
+
+# CHECK: tbeginc 0, 0
+0xe5 0x61 0x00 0x00 0x00 0x00
+
+# CHECK: tbeginc 0, 1
+0xe5 0x61 0x00 0x00 0x00 0x01
+
+# CHECK: tbeginc 0, 32767
+0xe5 0x61 0x00 0x00 0x7f 0xff
+
+# CHECK: tbeginc 0, 32768
+0xe5 0x61 0x00 0x00 0x80 0x00
+
+# CHECK: tbeginc 0, 65535
+0xe5 0x61 0x00 0x00 0xff 0xff
+
+# CHECK: tbeginc 0(%r1), 42
+0xe5 0x61 0x10 0x00 0x00 0x2a
+
+# CHECK: tbeginc 0(%r15), 42
+0xe5 0x61 0xf0 0x00 0x00 0x2a
+
+# CHECK: tbeginc 4095(%r1), 42
+0xe5 0x61 0x1f 0xff 0x00 0x2a
+
+# CHECK: tbeginc 4095(%r15), 42
+0xe5 0x61 0xff 0xff 0x00 0x2a
+
+# CHECK: tend
+0xb2 0xf8 0x00 0x00
+
 # CHECK: tm 0, 0
 0x91 0x00 0x00 0x00
 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233803 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-01 12:51:43 +00:00
Jiangning Liu
3ee56c2c67 Fix PR23065. Avoid optimizing bitcast of build_vector with constant input to scalar_to_vector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233778 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-01 01:52:38 +00:00
Ahmed Bougacha
252c120f4a [SimplifyLibCalls] Ignore nobuiltin/unavailable fortified libcalls.
We used to do this before refactorings around r225640.
Some clang users checked for _chk libcall availability using:
  __has_builtin(__builtin___memcpy_chk)
When compiling with -fno-builtin, this is always true.
When passing -ffreestanding/-mkernel, which both imply -fno-builtin, we
end up with fortified libcalls, which isn't acceptable in a freestanding
environment which only provides their non-fortified counterparts.

Until we change clang and/or teach external users to check for availability
differently, disregard the "nobuiltin" attribute and TLI::has.

Workaround for PR23093.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233776 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-01 00:45:09 +00:00
Hal Finkel
54ab6ce385 [PowerPC] FastISel can't handle i1 return values when using CR bits
Under normal circumstances, use of CR bits is disabled when running at -O0, but
it is enabled by default otherwise, and if you have optnone functions, they'll
still generally be generated with crbits turned on (because nothing else turns
them off). FastISel can't handle most things dealing with i1 values when using
CR bits, and checks for that, but was not checking the return type on
functions; we can't fast-isel function calls with i1 return values either when
using CR bits for boolean values.

Fixes PR22664.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233775 91177308-0d34-0410-b5e6-96231b3b80d8
2015-04-01 00:40:48 +00:00
David Majnemer
64386621ec [WinEH] Generate .xdata for catch handlers
This lets us catch exceptions in simple cases.

N.B. Things that do not work include (but are not limited to):
- Throwing from within a catch handler.
- Catching an object with a named catch parameter.
- 'CatchHigh' is fictitious, we aren't sure of its purpose.
- We aren't entirely efficient with regards to the number of EH states
  that we generate.
- IP-to-State tables are sensitive to the order of emission.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233767 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 22:35:44 +00:00
Duncan P. N. Exon Smith
9ce8e7ea73 Verifier: Add a testcase for verifying type refs
r233664 fixed the `Verifier` so that it doesn't crash on bad type refs.
This deserves a test!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233756 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 20:57:56 +00:00
Hal Finkel
72cce21049 [PowerPC] Don't use a vector preferred memory type at -O0
Even at -O0, we fall back to SDAG when we hit intrinsics, and if the intrinsic
is a memset/memcpy/etc. we might normally use vector types. At -O0, this is
probably not a good idea (because, if there is a bug in the lowering code,
there would be no good way to turn it off). At -O0, only use scalar preferred
types.

Related to PR22754.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233755 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 20:56:09 +00:00
Quentin Colombet
6aebd393f0 [AArch64] Enable the codegenprepare optimization that promotes operation to form
extended loads.
Implement the related target lowering hook so that the optimization has a better
estimation of the cost of an extension.

rdar://problem/19267165


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233753 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 20:52:32 +00:00
Hal Finkel
20fc6ac99e [SDAG] Handle non-integer preferred memset types for non-constant values
The existing code in getMemsetValue only handled integer-preferred types when
the fill value was not a constant. Make this more robust in two ways:

  1. If the preferred type is a floating-point value, do the mul-splat trick on
     the corresponding integer type and then bitcast.
  2. If the preferred type is a vector, do the mul-splat trick on one vector
     element, and then build a vector out of them.

Fixes PR22754 (although, we should also turn off use of vector types at -O0).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233749 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 20:35:26 +00:00
Sanjay Patel
1b10376319 [X86, AVX] fix zero-extending integer operand load patterns to use integer instructions
This is a follow-on to r233704 and another partial fix for PR22685:
https://llvm.org/bugs/show_bug.cgi?id=22685



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233724 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 18:43:43 +00:00
Lang Hames
89ddc1b326 [Orc][MCJIT] Remove the small code model regression tests.
These regression tests are supposed to test small code model support, but have
been XFAIL'd because we don't have an in-tree memory manager that can guarantee
a small-code-model compatible memory layout. Unfortunately, they can
occasionally pass if they get lucky with memory allocation, causing unexpected
passes on the bots. That's not very helpful.

I'm going to remove these until we have the infrastructure (small-code-model
compatible memory manager) to run them properly.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233722 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 18:19:25 +00:00
Tim Northover
5b8131701d AArch64: fix v8.1 sqrdmlah tests on Darwin platforms
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233709 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 16:41:38 +00:00
Sanjay Patel
7ea151449d [X86, AVX] try to lowerVectorShuffleAsElementInsertion() for all 256-bit vector sub-types
I suggested this change in D7898 (http://llvm.org/viewvc/llvm-project?view=revision&revision=231354)

It improves the v4i64 case although not optimally. This AVX codegen:

  vmovq {{.*#+}} xmm0 = mem[0],zero
  vxorpd %ymm1, %ymm1, %ymm1
  vblendpd {{.*#+}} ymm0 = ymm0[0],ymm1[1,2,3]

Becomes:

  vmovsd {{.*#+}} xmm0 = mem[0],zero

Unfortunately, this doesn't completely solve PR22685. There are still at least 2 problems under here:

    We're not handling v32i8 / v16i16.
    We're not getting the FP / int domains right for instruction selection.

But since this patch alone appears to do no harm, reduces code duplication, and helps v4i64, 
I'm submitting this patch ahead of fixing the above.

Differential Revision: http://reviews.llvm.org/D8341



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233704 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 16:32:11 +00:00
Krzysztof Parzyszek
4654bc762e Expand MUX instructions early on Hexagon
This time with all files included.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233696 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 13:35:12 +00:00
Krzysztof Parzyszek
b7c19b3cc9 Revert 233694. Weak SVN-fu.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233695 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 13:32:32 +00:00
Krzysztof Parzyszek
af4ad2d843 Expand MUX instructions early on Hexagon
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233694 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 13:29:17 +00:00
Vladimir Sukharev
e99524cf52 [AArch64] Add v8.1a "Rounding Double Multiply Add/Subtract" extension
Reviewers: t.p.northover, jmolloy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8502


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233693 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 13:15:48 +00:00
Ulrich Weigand
cb1b3ad4e1 [SystemZ] Support RISBGN instruction on zEC12
So far, we do not yet support any instruction specific to zEC12.
Most of the facilities added with zEC12 are indeed not very useful
to compiler code generation, but there is one exception: the
miscellaneous-extensions facility provides the RISBGN instruction,
which is a variant of RISBG that does not set the condition code.

Add support for this facility, MC support for RISBGN, and CodeGen
support for prefering RISBGN over RISBG on zEC12, unless we can
actually make use of the condition code set by RISBG.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233690 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 12:58:17 +00:00
Ulrich Weigand
ee84973420 [SystemZ] Use POPCNT instruction on z196
We already exploit a number of instructions specific to z196,
but not yet POPCNT.  Add support for the population-count
facility, MC support for the POPCNT instruction, CodeGen
support for using POPCNT, and implement the getPopcntSupport
TargetTransformInfo hook.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233689 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 12:56:33 +00:00
Ulrich Weigand
64aa9d8b4c [SystemZ] Provide basic TargetTransformInfo implementation
This hooks up the TargetTransformInfo machinery for SystemZ,
and provides an implementation of getIntImmCost.

In addition, the patch adds the isLegalICmpImmediate and
isLegalAddImmediate TargetLowering overrides, and updates
a couple of test cases where we now generate slightly
better code.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233688 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 12:52:27 +00:00
Rafael Espindola
378981499a Fix the operand encoding in the test instruction.
Fixes pr22995.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233686 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 12:31:55 +00:00
James Molloy
369ee1b1f4 [SDAG] Move TRUNCATE splitting logic into a helper, and use
it more liberally.

SplitVecOp_TRUNCATE has logic for recursively splitting oversize vectors
that need more than one round of splitting to become legal. There are many
other ISD nodes that could benefit from this logic, so factor it out and
use it for FP_TO_UINT,FP_TO_SINT,SINT_TO_FP,UINT_TO_FP and FTRUNC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233681 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 10:20:58 +00:00
Ahmed Bougacha
18128bd376 [X86] Generate MOVNT for all vector types.
We used to miss non-Q YMM integer vectors, and, non-Q/D XMM integer
vectors.
While there, change the v4i32 patterns to prefer MOVNTDQ.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233668 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 03:16:51 +00:00
Duncan P. N. Exon Smith
4e5fdbfc61 tools: Unify how verifyModule() is called
Unify the error messages for the various tools when `verifyModule()`
fails on an input module.  The "brave new way" is:

    lltool: path/to/input.ll: error: input module is broken!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233667 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 03:07:23 +00:00
Alexei Starovoitov
8ffc5ca532 [bpf] mark mov instructions as ReMaterializable
loading immediate into register is cheap, so take advantage of remat.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233666 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 02:49:58 +00:00
Duncan P. N. Exon Smith
c0bf4a0671 Verifier: Move more debug info checks away from Verify()
Most of these checks were already in the `Verifier` so this is more of a
cleanup.  Now almost everything is over there.

Now that require a `name:` for `MDGlobalVariable`, add a check in
`LLParser` for it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233657 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 01:28:22 +00:00
Duncan P. N. Exon Smith
2f5cbb5947 Verifier: Move checks over from DIDescriptor::Verify()
Move over some more checks from `DIDescriptor::Verify()`, and change
`LLParser` to require non-null `file:` fields in compile units.

I've ignored the comment in test/Assembler/metadata-null-operands.ll
since I disagree with it.  At the time that test was written (r229960),
the debug info verifier wasn't on by default, so my comment there is in
the context of not expecting the verifier to be useful.  It is now, and
besides that, since r233394 we can check when parsing textual IR whether
an operand is null that shouldn't be.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233654 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 00:47:15 +00:00
Quentin Colombet
9e5f04d219 [AArch64] Fix poor codegen for add immediate.
We used to match the register variant before the immediate when the register
argument could be implicitly zero-extended.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233653 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-31 00:31:13 +00:00
David Majnemer
0a8ff297ad [WinEH] Run cleanup handlers when an exception is thrown
Generate tables in the .xdata section representing what actions to take
when an exception is thrown.  This currently fills in state for
cleanups, catch handlers are still unfinished.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233636 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-30 22:58:10 +00:00
Juergen Ributzka
1c8595529b Transfer implicit operands when expanding the RET_ReallyLR pseudo instruction.
When we expand the RET_ReallyLR pseudo instruction we also need to transfer the
implicit operands.

The return register is an implicit operand and without it the liveness
calculation generates an incorrect live-out set for the patchpoint.

This fixes rdar://problem/19068476.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233635 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-30 22:45:56 +00:00
Alexei Starovoitov
5ca9ec70fd [bpf] add support for bswap instructions
BPF has cpu_to_be and cpu_to_le instructions.
For now assume little endian and generate cpu_to_be for ISD::BSWAP.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233620 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-30 22:40:40 +00:00
Peter Collingbourne
07ee8d2fc1 MC: For variable symbols, maintain MCSymbol::Section as a cache.
This fixes the visibility of symbols in certain edge cases involving aliases
with multiple levels of indirection.

Fixes PR19582.

Differential Revision: http://reviews.llvm.org/D8586

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233595 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-30 20:41:21 +00:00
Paul Robinson
103d622517 Verify 'optnone' can run DAG combiner when appropriate.
Adds a test to verify the behavior that r233153 restored: 'optnone'
does not spuriously disable the DAG combiner, and in fact there are
cases where the DAG combiner must run (even at -O0 or 'optnone') in
order for codegen to succeed.

Differential Revision: http://reviews.llvm.org/D8614


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233584 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-30 19:37:44 +00:00
Justin Holewinski
e6d6461067 [NVPTX] Associate a minimum PTX version for each SM architecture
When a new SM architecture is introduced, it is only supported by the
current PTX version and later.  Make sure we are using at least the
minimum PTX version for the target architecture.

This also removes support for PTX ISA < 3.2.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233583 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-30 19:30:55 +00:00
Justin Holewinski
d85053fc3f [NVPTX] Add options for PTX 4.1/4.2 and SM 3.2/3.7/5.2/5.3
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233575 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-30 18:12:50 +00:00
Duncan P. N. Exon Smith
7380257f0e Verifier: Add operand checks for remaining debug info
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233565 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-30 17:21:38 +00:00
Duncan P. N. Exon Smith
d3ec0ca17e Verifier: Add operand checks for MDLexicalBlock
Add operand checks for `MDLexicalBlock` and `MDLexicalBlockFile`.  Like
`MDLocalVariable` and `MDLocation`, these nodes always require a scope.

There was no test bitrot to fix here (just updated the serialization
tests in test/Assembler/mdlexicalblock.ll).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233561 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-30 16:37:48 +00:00
Duncan P. N. Exon Smith
0c251f74ef DebugInfo: Rename some testcases
Momentarily (but never in tree), the `scope:` field was called
`parent:`.  Apparently a few testcases were left behind with "parent" in
the name, so rename them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233560 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-30 16:21:28 +00:00
Simon Pilgrim
d8f2918479 [X86] Ensure integer domain on scalar i64 load/store stack folding tests. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233553 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-30 15:25:51 +00:00
Ulrich Weigand
5738b77176 [SystemZ] Fix LLVM crash on unoptimized code
Compiling the following function with -O0 would crash, since LLVM would
hit an assertion in getTestUnderMaskCond:

  int test(unsigned long x)
  {
    return x >= 0 && x <= 15;
  }

Fixed by detecting the case in the caller of getTestUnderMaskCond.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233541 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-30 13:46:59 +00:00
Daniel Sanders
35efeb5e16 [mips] Support 9-bit offsets for the 'R' inline assembly memory constraint.
Summary:
The 'R' constraint is actually supposed to be much more complicated than
this and is defined in terms of whether it will cause macro expansion in
the assembler. 'R' is getting less useful due to architecture changes and
ought to be replaced by other constraints. We therefore implement 9-bit
offsets which will work for all subtargets and all instructions.

Reviewers: vkalintiris

Reviewed By: vkalintiris

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8440


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233537 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-30 13:27:25 +00:00
Daniel Jasper
a4b389c125 Revert "[SCEV] Look at backedge dominating conditions."
This leads to terribly slow compile times under MSAN. More discussion
on the commit thread of r233447.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233529 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-30 09:30:02 +00:00
Elena Demikhovsky
10e73aeede AVX-512: blank lines, duplicated tests, no functional changes
see comments http://reviews.llvm.org/D6835


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233528 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-30 09:29:28 +00:00
Elena Demikhovsky
f5f12f1e92 AVX-512: added intrinsics for VPAND, VPOR and VPXOR
by Asaf Badouh (asaf.badouh@intel.com)


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233525 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-30 08:30:34 +00:00
Benjamin Kramer
155328790b [inline asm] Don't reject duplicated matching constraints
They're harmless and it's easy to generate them from clang, leading to
a crash in LLVM. Found by afl-fuzz.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233500 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-29 20:33:07 +00:00
Akira Hatanaka
0de206d8d6 [Objdump] Pass the correct subtarget to printInst.
This fixes a bug I introduced in r233411.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233484 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-28 20:44:05 +00:00
Hal Finkel
2eaf50f5fb [PowerPC] Add asm parser support for bitmask forms of rotate-and-mask instructions
The asm syntax for the 32-bit rotate-and-mask instructions can take a 32-bit
bitmask instead of an (mb, me) pair. This syntax is not specified in the Power
ISA manual, but is accepted by GNU as, and is documented in IBM's Assembler
Language Reference. The GNU Multiple Precision Arithmetic Library (gmp)
contains assembly that uses this syntax.

To implement this, I moved the isRunOfOnes utility function from
PPCISelDAGToDAG.cpp to PPCMCTargetDesc.h.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233483 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-28 19:42:41 +00:00
Hal Finkel
8360002904 [ConstantFold] Don't fold ppc_fp128 <-> int bitcasts
PPC_FP128 is really the sum of two consecutive doubles, where the first double
is always stored first in memory, regardless of the target endianness. The
memory layout of i128, however, depends on the target endianness, and so we
can't fold this without target endianness information. As a result, we must not
do this folding in lib/IR/ConstantFold.cpp (it could be done instead in
Analysis/ConstantFolding.cpp, but that's not done now).

Fixes PR23026.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233481 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-28 16:44:57 +00:00
Duncan P. N. Exon Smith
cefca10039 Verifier: Allow subroutine types to have no type array
Loosen one check from r233446: as long as `DIBuilder` requires a
non-null type for every subprogram, we should allow a null type array.
Also add tests for the rest of `MDSubroutineType`, which were somehow
missing.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233468 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-28 02:43:53 +00:00
Duncan P. N. Exon Smith
d397a52305 DebugInfo: Fix testcases with invalid MDSubprogram nodes
Fix testcases that don't pass the verifier after a WIP patch to check
`MDSubprogram` operands more effectively.  I found the following issues:

  - When `isDefinition: false`, the `variables:` field might point at
    `!{i32 786468}`, or at a tuple that pointed at an empty tuple with
    the comment "previously: invalid DW_TAG_base_type" (I vaguely recall
    adding those comments during an upgrade script).  In these cases, I
    just dropped the array.
  - The `variables:` field might point at something like `!{!{!8}}`,
    where `!8` was an `MDLocation`.  I removed the extra layer of
    indirection.
  - Invalid `type:` (not an `MDSubroutineType`).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233466 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-28 02:26:45 +00:00
Akira Hatanaka
57e9efecb0 [ARM] Enable changing instprinter's behavior based on the per-function
subtarget.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233451 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 23:41:42 +00:00
Sanjoy Das
e464bbfb5a [SCEV] Look at backedge dominating conditions.
Summary:
This change teaches ScalarEvolution::isLoopBackedgeGuardedByCond to look
at edges within the loop body that dominate the latch.  We don't do an
exhaustive search for all possible edges, but only a quick walk up the
dom tree.

Reviewers: atrick, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8627

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233447 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 23:18:08 +00:00
Duncan P. N. Exon Smith
de89228dfb Verifier: Call verifyModule() from llc and opt
Change `llc` and `opt` to run `verifyModule()`.  This ensures that we
check the full module before `FunctionPass::doInitialization()` ever
gets called (I was getting crashes in `DwarfDebug` instead of verifier
failures when testing a WIP patch that checks operands of compile
units).  In `opt`, also move up debug-info-stripping so that it still
runs before verification.

There was a fair bit of broken code that was sitting in tree.
Interestingly, some were cases of a `select` that referred to itself in
`-instcombine` tests (apparently an intermediate result).  I split them
off to `*-noverify.ll` tests with RUN lines like this:

    opt < %s -S -disable-verify -instcombine | opt -S | FileCheck %s

This avoids verifying the input file (so we can get the broken code into
`-instcombine), but still verifies the output with a second call to
`opt` (to verify that `-instcombine` will clean it up like it should).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233432 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 22:04:28 +00:00
Duncan P. N. Exon Smith
77a0728d96 DebugInfo: Fix bad debug info for compile units and types
Fix debug info in these tests, which started failing with a WIP patch to
verify compile units and types.  The problems look like they were all
caused by bitrot.  They fell into these categories:

  - Using `!{i32 0}` instead of `!{}`.
  - Using `!{null}` instead of `!{}`.
  - Using `!MDExpression()` instead of `!{}`.
  - Using `!8` instead of `!{!8}`.
  - `file:` references that pointed at `MDCompileUnit`s instead of the
    same `MDFile` as the compile unit.
  - `file:` references that were numerically off-by-one or (off-by-ten).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233415 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 20:46:33 +00:00
Ahmed Bougacha
81cd5e82a8 [R600/SI] Fix testcase check line.
Missing colon, instruction typo.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233414 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 20:41:42 +00:00
Akira Hatanaka
f2f2ef70a0 [AArch64InstPrinter] Use the feature bits of the subtarget passed to the print
method.

This enables the instprinter to print a different system register name based on
the feature bits of the per-function subtarget. 

Differential Revision: http://reviews.llvm.org/D8668 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233412 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 20:37:20 +00:00
Ahmed Bougacha
19e2fce680 [CodeGen] Don't attempt a tail-call with a non-forwarded explicit sret.
Tailcalls are only OK with forwarded sret pointers. With explicit sret,
one approximation is to check that the pointer isn't an Instruction, as
in that case it might point into some local memory (alloca). That's not
OK with tailcalls.

Explicit sret counterpart to r233409.
Differential Revison: http://reviews.llvm.org/D8510


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233410 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 20:35:49 +00:00
Ahmed Bougacha
2615b686d3 [CodeGen] Don't attempt a tail-call with implicit sret.
Tailcalls are only OK with forwarded sret pointers. With sret demotion,
they're not, as we'd have a pointer into a soon-to-be-dead stack frame.

Differential Revison: http://reviews.llvm.org/D8510


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233409 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 20:28:30 +00:00
Alexei Starovoitov
4193093152 [bpf] add support for bpf pseudo instruction
Expose bpf pseudo load instruction via intrinsic. It is used by front-ends that
can encode file descriptors directly into IR instead of relying on relocations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233396 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 18:51:42 +00:00
Quentin Colombet
1ad854adba [RegisterCoalescer] Refine the terminal rule to still consider the terminal
nodes.
When a node is terminal it is pushed at the end of the list of the copies to
coalesce instead of being completely ignored. In effect, this reduces its
priority over non-terminal nodes.

Because of that, we do not miss the rematerialization opportunities, nor the
copies that can be merged with more complex, than the terminal rule,
interference checks.

Related to PR22768.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233395 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 18:37:15 +00:00
Duncan P. N. Exon Smith
2cee1c9d3c LLParser: Require non-null scope for MDLocation and MDLocalVariable
Change `LLParser` to require a non-null `scope:` field for both
`MDLocation` and `MDLocalVariable`.  There's no need to wait for the
verifier for this check.  This also allows their `::getImpl()` methods
to assert that the incoming scope is non-null.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233394 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 17:56:39 +00:00
Adrian Prantl
1fa94d6f92 Add a -raw option to the -section mode of llvm-objdump.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233390 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 17:31:15 +00:00
Duncan P. N. Exon Smith
a9902daa5c Verifier: Check fields of MDVariable subclasses
Check fields from `MDLocalVariable` and `MDGlobalVariable` and change
the accessors to downcast to the right types.  `getType()` still returns
`Metadata*` since it could be an `MDString`-based reference.

Since local variables require non-null scopes, I also updated `LLParser`
to require a `scope:` field.

A number of testcases had grown bitrot and started failing with this
patch; I committed them separately in r233349.  If I just broke your
out-of-tree testcases, you're probably hitting similar problems (so have
a look there).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233389 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 17:29:58 +00:00
Duncan P. N. Exon Smith
2ee615297c DebugInfo: Fix another bitrotted testcase
Fix another case of a missing `scope:` field on an `MDLocalVariable`.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233388 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 17:29:11 +00:00
Rafael Espindola
ac406dfb70 Work around pr23045 and make it easier to reproduce.
Dropping old debug format requires the entire module to be read upfront.

This was failing only with the gold plugin, but that is just because
llvm-link was not upgrading metadata.

The new testcase using llvm-link shows the problem.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233381 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 15:55:06 +00:00
Rafael Espindola
121eb4257d Close unique sections when switching away from them.
It is not possible to switch back to unique secitons, so close them
automatically when switching away.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233380 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 15:01:40 +00:00
Yaron Keren
407da8c267 Fix subprogram-linkonce-weak.ll and subprogram-linkonce-weak-odr.ll for Windows.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233375 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 13:52:12 +00:00
James Molloy
fb45b9fafc Reapply r233175 and r233183: float2int.
This re-adds float2int to the tree, after fixing PR23038. It turns
out the argument to APSInt() is true-if-unsigned, rather than
true-if-signed :(. Added testcase and explanatory comment.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233370 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 10:36:57 +00:00
Andrew Trick
9217916725 Complete the MachineScheduler fix made way back in r210390.
"Fix the MachineScheduler's logic for updating ready times for in-order.
 Now the scheduler updates a node's ready time as soon as it is
 scheduled, before releasing dependent nodes."

This fix was only made in one variant of the ScheduleDAGMI driver.
Francois de Ferriere reported the issue in the other bit of code where
it was also needed.
I never got around to coming up with a test case, but it's an
obvious fix that shouldn't be delayed any longer.
I'll try to refactor this code a little better.

I did verify performance on a wide variety of targets and saw no
negative impact with this fix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233366 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 06:10:13 +00:00
Philip Reames
76acab3fc7 Require a GC strategy be specified for functions which use gc.statepoint
This was discussed a while back and I left it optional for migration.  Since it's been far more than the 'week or two' that was discussed, time to actually make this manditory.  



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233357 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 05:09:33 +00:00
Philip Reames
28ffcd3f1e Allow explicit spill slots to be specified for a gc.statepoint
This patch adds support for explicitly provided spill slots in the GC arguments of a gc.statepoint.  This is somewhat analogous to gcroot, but leverages the STATEPOINT MI node and StackMap infrastructure.  The motivation for this is:
1) The stack spilling code for gc.statepoints hasn't advanced as fast as I'd like.  One major option is to give up on doing spilling in the backend and do it at the IR level instead.  We'd give up the ability to have gc values in registers, but that's a minor cost in practice.  We are not neccessarily moving in that direction, but having the ability to prototype such a thing cheaply is interesting.
2) I want to port the gcroot lowering to use the statepoint infastructure.  Given the metadata printers for gcroot expect a fixed set of stack roots, it's easiest to just reuse the explicit stack slots and pass them directly to the underlying statepoint.  

I'm holding off on the documentation for the new feature until I'm reasonable sure this is going to stick around.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233356 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 04:52:48 +00:00
Andrew Trick
b827c4b923 Reintroduce the SelectionDAG scheduler test for r233351.
This test returns nonnative integer types which aren't supported on all targets.
The real issue with the SelectionDAG scheduler is with x86 EFLAGS.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233355 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 04:42:52 +00:00
David Majnemer
db573736fd WinEH: Create a parent frame alloca for HandlerType xdata tables
We don't have any logic to emit those tables yet, so the SDAG lowering
of this intrinsic is just a stub.  We can see the intrinsic in the
prepared IR, though.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233354 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 04:17:07 +00:00
Andrew Trick
6284bcd95b This test should have been target specific. I missed that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233353 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 04:04:35 +00:00
Andrew Trick
18fc42ab12 Fix a bug in SelectionDAG scheduling backtracking code: PR22304.
It can happen (by line CurSU->isPending = true; // This SU is not in
AvailableQueue right now.) that a SUnit is mark as available but is
not in the AvailableQueue. For SUnit being selected for scheduling
both conditions must be met.

This patch mainly defensively protects from invalid removing a node
from a queue. Sometimes nodes are marked isAvailable but are not in
the queue because they have been defered due to some hazard.

Patch by Pawel Bylica!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233351 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 03:44:13 +00:00
Nick Lewycky
b3ad90eacc Revert r233175 and r233183 with it. This pulls float2int back out of the tree, due to PR23038.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233350 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 02:00:11 +00:00
Duncan P. N. Exon Smith
97fcbf1874 DebugInfo: Update testcases with invalid variables
Fix testcases whose variables are invalid.  I'm working on a patch that
adds `Verifier` checks for `MDLocalVariable` (and `MDGlobalVariable`),
and these failed because:

  - `scope:` fields need to point at `MDLocalScope` and can't be null.
  - `file:` fields need to point at `MDFile`.
  - `inlinedAt:` fields need to point at `MDLocation`.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233349 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 01:58:34 +00:00
Ahmed Bougacha
f265603512 [AsmPrinter] Don't assert on GOT equivalent non-constant users.
We used to dyn_cast<Constant> in the recursive call, but cast<> in the
initial one, and there can be non-Constant initial users.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233346 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-27 01:40:54 +00:00
Derek Schuff
32b33c2c30 Use movw/movt instead of constant pool loads to lower byval parameter copies
Summary:
The ARM backend can use a loop to implement copying byval parameters before
a call. In non-thumb2 mode it uses a constant pool load to materialize the
trip count. For targets that need movt instead (e.g. Native Client), use
the same code as in thumb2 mode to materialize the trip count.

Reviewers: jfb, t.p.northover

Differential Revision: http://reviews.llvm.org/D8442

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233324 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-26 22:11:00 +00:00
Duncan P. N. Exon Smith
c4eafd24f2 Verifier: Check accessors of MDLocation
Check accessors of `MDLocation`, and change them to `cast<>` down to the
right types.  Also add type-safe factory functions.

All the callers that handle broken code need to use the new versions of
the accessors (`getRawScope()` instead of `getScope()`) that still
return `Metadata*`.  This is also necessary for things like
`MDNodeKeyImpl<MDLocation>` (in LLVMContextImpl.h) that need to unique
the nodes when their operands might still be forward references of the
wrong type.

In the `Value` hierarchy, consumers that handle broken code use
`getOperand()` directly.  However, debug info nodes have a ton of
operands, and their order (even their existence) isn't stable yet.  It's
safer and more maintainable to add an explicit "raw" accessor on the
class itself.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233322 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-26 22:05:04 +00:00
Rafael Espindola
9d2f0da138 Fix PR23025.
There is something in link.exe that requires a relocation to use a
global symbol. Not doing so breaks the chrome build on windows.

This patch sets isWeak for that to work. To compensate,
we then need to look past those symbols when not creating relocations.

This patch includes an ELF test that matches GNU as behaviour.

I am still reducing the chrome build issue and will add a test
once that is done.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233318 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-26 21:11:00 +00:00
Justin Bogner
55a5cb1a4d [ARM] Fix some non-portable shell syntax in r233301's tests
The "|&" operator isn't POSIX, so it can fail depending on the host's
default shell. Avoid it.

There were also a couple of places that did "2>1", but this creates a
file called "1". They clearly meant "2>&1".

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233309 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-26 19:24:13 +00:00
Duncan P. N. Exon Smith
387a89bb36 Reapply "Linker: Drop function pointers for overridden subprograms"
This reverts commit r233254, effectively reapplying r233164 (and its
successors), with an additional testcase for when subprograms match
exactly.  This fixes PR22792 (again).

I'm using the same approach, but I've moved up the call to
`stripReplacedSubprograms()`.  The function pointers need to be dropped
before mapping any metadata from the source module, or else this can
drop the function from new subprograms that have merged (via Metadata
uniquing) with the old ones.  Dropping the pointers first prevents them
from merging.

**** The original commit message follows. ****

Linker: Drop function pointers for overridden subprograms

Instead of dropping subprograms that have been overridden, just set
their function pointers to `nullptr`.  This is a minor adjustment to the
stop-gap fix for PR21910 committed in r224487, and fixes the crasher
from PR22792.

The problem that r224487 put a band-aid on: how do we find the canonical
subprogram for a `Function`?  Since the backend currently relies on
`DebugInfoFinder` (which does a naive in-order traversal of compile
units and picks the first subprogram) for this, r224487 tried dropping
non-canonical subprograms.

Dropping subprograms fails because the backend *also* builds up a map
from subprogram to compile unit (`DwarfDebug::SPMap`) based on the
subprogram lists.  A missing subprogram causes segfaults later when an
inlined reference (such as in this testcase) is created.

Instead, just drop the `Function` pointer to `nullptr`, which nicely
mirrors what happens when an already-inlined `Function` is optimized
out.  We can't really be sure that it's the same definition anyway, as
the testcase demonstrates.

This still isn't completely satisfactory.  Two flaws at least that I can
think of:

  - I still haven't found a straightforward way to make this symmetric
    in the IR.  (Interestingly, the DWARF output is already symmetric,
    and I've tested for that to be sure we don't regress.)
  - Using `DebugInfoFinder` to find the canonical subprogram for a
    function is kind of crazy.  We should just attach metadata to the
    function, like this:

        define weak i32 @foo(i32, i32) !dbg !MDSubprogram(...) {

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233302 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-26 18:35:30 +00:00
Vladimir Sukharev
c8a807c1c5 [ARM] Add v8.1a "Rounding Double Multiply Add/Subtract" extension
Reviewers: t.p.northover

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8503


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233301 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-26 18:29:02 +00:00
Sanjoy Das
a9228aa73b [SCEV] Revert bailout added in r75511.
Summary:
With the introduction of MarkPendingLoopPredicates in r157092, I don't
think the bailout is needed anymore.

Reviewers: atrick, nicholas

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8624

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233296 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-26 17:28:26 +00:00
Benjamin Kramer
8f820e9495 InstCombine: fold (A << C) == (B << C) --> ((A^B) & (~0U >> C)) == 0
Anding and comparing with zero can be done in a single instruction on
most archs so this is a bit cheaper.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233291 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-26 17:12:06 +00:00
Vladimir Sukharev
27d12f3e6e [AArch64, ARM] Add v8.1a architecture and generic cpu
New architecture and cpu added, following http://community.arm.com/groups/processors/blog/2014/12/02/the-armv8-a-architecture-and-its-ongoing-development

Reviewers: t.p.northover

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8505


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233290 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-26 17:05:54 +00:00
Jingyue Wu
439d17ca9f [SLSR] handle candidate form &B[i * S]
Summary:
This patch enhances SLSR to handle another candidate form &B[i * S]. If
we found two candidates

S1: X = &B[i * S]
S2: Y = &B[i' * S]

and S1 dominates S2, we can replace S2 with

Y = &X[(i' - i) * S]

Test Plan:
slsr-gep.ll
X86/no-slsr.ll: verify that we do not run SLSR on GEPs that already fit into
an addressing mode

Reviewers: eliben, atrick, meheff, hfinkel

Reviewed By: hfinkel

Subscribers: sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D7459

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233286 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-26 16:49:24 +00:00
Andrea Di Biagio
971eb1b382 [X86][FastIsel] Teach how to select vector load instructions.
This patch teaches fast-isel how to select 128-bit vector load instructions.
Added test CodeGen/X86/fast-isel-vecload.ll

Differential Revision: http://reviews.llvm.org/D8605


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233270 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-26 11:29:02 +00:00
Duncan P. N. Exon Smith
a977f7e771 Revert "Linker: Drop function pointers for overridden subprograms"
This reverts commit r233164 and its testcase follow-ups in r233165,
r233207, r233214, and r233221.  It apparently unleashed an LTO bootstrap
failure, at least on Darwin:

http://lab.llvm.org:8080/green/job/clang-stage2-configure-Rlto_build/3376/

I'm reproducing now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233254 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-26 05:27:45 +00:00
Duncan P. N. Exon Smith
7a1a3e909d bugpoint: Verify input files
Like r233229 for `llvm-link`, start verifying input files to `bugpoint`.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233253 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-26 05:03:10 +00:00
Quentin Colombet
45f376c4e6 [RegisterCoalescer] Add a rule to consider more profitable copies first when
those are in the same basic block.
The previous approach was the topological order of the basic block.

By default this rule is disabled.

Related to PR22768.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233241 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-26 01:01:48 +00:00
Eric Christopher
dd9bb11cfc Testcase for r233239.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233240 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-26 00:57:33 +00:00
Duncan P. N. Exon Smith
6141f178a6 llvm-link: Verify input modules
Otherwise, broken input modules can cause assertions.  I've updated two
of the testcases that started failing (modules that had `Require` flags
but didn't meet their own requirements), but Rafael and I decided that
test/Linker/2011-08-22-ResolveAlias.ll should just be deleted outright
-- it's a leftover of the way llvm-gcc used to implement weakref.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233229 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 23:22:10 +00:00
Sanjoy Das
93d88192a2 [ValueTracking] Fix PR23011.
Summary:
`ComputeNumSignBits` returns incorrect results for `srem` instructions.
This change fixes the issue and adds a test case.

Reviewers: nadav, nicholas, atrick

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8600

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233225 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 22:33:53 +00:00
Simon Pilgrim
eb32048f80 [DAGCombiner] Add support for TRUNCATE + FP_EXTEND vector constant folding
This patch adds supports for the vector constant folding of TRUNCATE and FP_EXTEND instructions and tidies up the SINT_TO_FP and UINT_TO_FP instructions to match.

It also moves the vector constant folding for the FNEG and FABS instructions to use the DAG.getNode() functionality like the other unary instructions.

Differential Revision: http://reviews.llvm.org/D8593

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233224 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 22:30:31 +00:00
Duncan P. N. Exon Smith
00aafa5d73 Linker: Stop using -gmlt test/Linker/subprogram-linkonce-weak.ll
As dblaikie pointed out, if I stop setting `emissionKind: 2` then the
backend won't do magical things on Linux vs. Darwin.  I had wrongly
assumed that there were stricter requirements on the input if we weren't
in line-tables-only mode, but apparently not.

With that knowledge, clean up this testcase a little more.

  - Set `emissionKind: 1`.
  - Add back checks for the weak version of @foo.
  - Check more robustly that we have the right subprograms by checking
    the `DW_AT_decl_file` and `DW_AT_decl_line` which now show up.
  - Check the line table in isolation (since it's no longer doubling as
    an indirect test for the subprogram of the weak version of @foo).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233221 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 21:36:41 +00:00
Duncan P. N. Exon Smith
3f972ab1bf Linker: Loosen checks slightly from r233207
According to at least one bot [1], function prologues aren't always
empty for these functions.  Skip that part of the follow-up check.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233214 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 20:51:21 +00:00
Reid Kleckner
a8b46e7055 WinEH: Create an unwind help alloca for __CxxFrameHandler3 xdata tables
We don't have any logic to emit those tables yet, so the sdag lowering
of this intrinsic is just a stub. We can see the intrinsic in the
prepared IR, though.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233209 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 20:10:36 +00:00
Duncan P. N. Exon Smith
151bc36220 Linker: Rewrite dwarfdump checks from r233164
Rewrite the checks from r233164 that I temporarily disabled in r233165.

It turns out that the line-tables only debug info we emit from `llc` is
(intentionally) different on Linux than on Darwin.  r218129 started
skipping emission of subprograms with no inlined subroutines, and
r218702 was a spiritual revert of that behaviour for Darwin.

I think we can still test this in a platform-neutral way.

  - Stop checking for the possibly missing `DW_TAG_subprogram` defining
    the debug info for the real version of `@foo`.
  - Start checking the line tables, ensuring that the right debug info
    was used to generate them (grabbing `DW_AT_low_pc` from the compile
    unit).
  - I changed up the line numbers used in the "weak" version so it's
    easier to follow.

This should hopefully finish off PR22792.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233207 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 19:57:42 +00:00
Kit Barton
bd9a548881 Add Hardware Transactional Memory (HTM) Support
This patch adds Hardware Transaction Memory (HTM) support supported by ISA 2.07
(POWER8). The intrinsic support is based on GCC one [1], but currently only the
'PowerPC HTM Low Level Built-in Function' are implemented.

The HTM instructions follows the RC ones and the transaction initiation result
is set on RC0 (with exception of tcheck). Currently approach is to create a
register copy from CR0 to GPR and comapring. Although this is suboptimal, since
the branch could be taken directly by comparing the CR0 value, it generates code
correctly on both test and branch and just return value. A possible future
optimization could be elimitate the MFCR instruction to branch directly.

The HTM usage requires a recently newer kernel with PPC HTM enabled. Tested on
powerpc64 and powerpc64le.

This is send along a clang patch to enabled the builtins and option switch.

[1] https://gcc.gnu.org/onlinedocs/gcc/PowerPC-Hardware-Transactional-Memory-Built-in-Functions.html

Phabricator Review: http://reviews.llvm.org/D8247


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233204 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 19:36:23 +00:00
Peter Collingbourne
489e5e5a86 Simplify missing-file-line.ll test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233201 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 17:58:09 +00:00
Peter Collingbourne
c33c447af9 DebugInfo: Permit DW_TAG_structure_type, DW_TAG_member, DW_TAG_typedef tags with empty file names.
Some languages, such as Go, have pre-defined structure types (e.g. "string"
is essentially a pointer/length pair) or pre-defined "typedef" types
(e.g. "error" is essentially a typedef for a specific interface type).
Such types do not have associated source location, so a Go frontend would
be correct not to associate a file name with such types.

This change relaxes the DIType verifier to permit unlocated types with
these tags.

Differential Revision: http://reviews.llvm.org/D8588

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233200 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 17:44:49 +00:00
Sanjay Patel
e53dbeb2ad [X86, AVX] improve insertion into zero element of 256-bit vector
This patch allows AVX blend instructions to handle insertion into the low
element of a 256-bit vector for the appropriate data types.

For f32, instead of:

   vblendps	$1, %xmm1, %xmm0, %xmm1 ## xmm1 = xmm1[0],xmm0[1,2,3]
   vblendps	$15, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0,1,2,3],ymm0[4,5,6,7]

we get:

   vblendps	$1, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0],ymm0[1,2,3,4,5,6,7]

For f64, instead of:

   vmovsd	%xmm1, %xmm0, %xmm1     ## xmm1 = xmm1[0],xmm0[1]
   vblendpd	$3, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0,1],ymm0[2,3]

we get:

   vblendpd	$1, %ymm1, %ymm0, %ymm0 ## ymm0 = ymm1[0],ymm0[1,2,3]

For the hardware-neglected integer data types, I left a TODO comment in the
code and added regression tests for a follow-on patch.

Differential Revision: http://reviews.llvm.org/D8609



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233199 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 17:36:01 +00:00
Sanjay Patel
62dd074fe9 use update_llc_test_checks.py to tighten checking in these tests
1. There were no CHECK-LABELs, so we could match instructions from the wrong function.
2. The use of zero operands meant multiple xor instructions could match some CHECKs.
3. The test was over-specified to need a Sandybridge CPU and Darwin triple.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233198 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 17:34:11 +00:00
Daniel Jasper
a427f12c6d Make exit-code test use same mechanism as existing one.
The other version doesn't properly work with our internal test runner,
which sets pipefail.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233188 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 14:35:40 +00:00
Rafael Espindola
d033c4b576 Fix fixup evaluation when deciding what to relocate with.
The previous logic was to first try without relocations at all
and failing that stop on the first defined symbol.

That was inefficient and incorrect in the case part of the
expression could be simplified and another part could not
(see included test).

We now stop the evaluation when we get to a variable whose value
can change (i.e. is weak).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233187 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 13:16:53 +00:00
Andrea Di Biagio
962dcdfdda [optnone] Skip pass Float2Int on optnone functions.
Added test Float2Int/float2int-optnone.ll to verify that pass Float2Int
is not run on optnone functions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233183 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 12:22:37 +00:00
Lang Hames
47fd5639bc [Orc][lli] Add a very simple Orc-based lazy JIT to lli.
This ensures that we're building and testing the CompileOnDemand layer, at least
in a basic way.

Currently x86-64 only, and with limited to no library calls enabled (depending
on host platform). Patches welcome. ;)

To enable access to the lazy JIT, this patch replaces the '-use-orcmcjit' lli
option with a new option:
'-jit-kind={ mcjit | orc-mcjit | orc-lazy }'.

All regression tests are updated to use the new option, and one trivial test of
the new lazy JIT is added.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233182 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 12:11:48 +00:00
Andrea Di Biagio
3132115738 [X86] Simplify check lines in tests. No functional change.
Also, removed unused check lines from test atomic6432.ll.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233181 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 11:44:19 +00:00
James Molloy
09f1b672cb Reapply r233062: "float2int": Add a new pass to demote from float to int where possible.
Now with a fix for PR23008 and extra regression test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233175 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 10:03:42 +00:00
Justin Bogner
c0ffe0f061 test: Fix the dependencies for the check-llvm-* targets
In r233009 we gained specific check-llvm-* build targets for invoking
specific parts of the test suite, but they were copying the
dependencies for check-all, rather than just listing the dependencies
for check-llvm.

This moves the creation of these targets next to the check-llvm
target, and uses that target's configuration rather than the check-all
config.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233174 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 08:07:47 +00:00
Duncan P. N. Exon Smith
2186876e87 Linker: Temporarily disable dwarfdump checks from r233164
At least one Linux bot [1] doesn't like my dwarfdump checks, so I've
disable those until I can investigate what's going on there.  I'll
continue to track this in PR22792.

[1]: http://bb.pgr.jp/builders/cmake-llvm-x86_64-linux/builds/22863

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233165 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 02:43:04 +00:00
Duncan P. N. Exon Smith
c89369a941 Linker: Drop function pointers for overridden subprograms
Instead of dropping subprograms that have been overridden, just set
their function pointers to `nullptr`.  This is a minor adjustment to the
stop-gap fix for PR21910 committed in r224487, and fixes the crasher
from PR22792.

The problem that r224487 put a band-aid on: how do we find the canonical
subprogram for a `Function`?  Since the backend currently relies on
`DebugInfoFinder` (which does a naive in-order traversal of compile
units and picks the first subprogram) for this, r224487 tried dropping
non-canonical subprograms.

Dropping subprograms fails because the backend *also* builds up a map
from subprogram to compile unit (`DwarfDebug::SPMap`) based on the
subprogram lists.  A missing subprogram causes segfaults later when an
inlined reference (such as in this testcase) is created.

Instead, just drop the `Function` pointer to `nullptr`, which nicely
mirrors what happens when an already-inlined `Function` is optimized
out.  We can't really be sure that it's the same definition anyway, as
the testcase demonstrates.

This still isn't completely satisfactory.  Two flaws at least that I can
think of:

  - I still haven't found a straightforward way to make this symmetric
    in the IR.  (Interestingly, the DWARF output is already symmetric,
    and I've tested for that to be sure we don't regress.)
  - Using `DebugInfoFinder` to find the canonical subprogram for a
    function is kind of crazy.  We should just attach metadata to the
    function, like this:

        define weak i32 @foo(i32, i32) !dbg !MDSubprogram(...) {

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233164 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 02:26:32 +00:00
Rafael Espindola
49dba99a89 Produce an error instead of asserting on invalid .sleb128/.uleb128.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233155 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 00:25:37 +00:00
Paul Robinson
759015d80f 'optnone' should not disable DAG combiner.
Reverts the code change from r221168 and the relevant test.
It was a mistake to disable the combiner, and based on the ultimate
definition of 'optnone' we shouldn't have considered the test case
as failing in the first place.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233153 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-25 00:10:24 +00:00
Philip Reames
c50bf46a22 !invariant.load semantics with potentially clobbering calls
A load from an invariant location is assumed to not alias any otherwise potentially aliasing stores. Our implementation only applied this rule to store instructions themselves whereas they it should apply for any memory accessing instruction. This results in both FRE and PRE becoming more effective at eliminating invariant loads.

Note that as a follow on change I will likely move this into AliasAnalysis itself. That's where the TBAA constant flag is handled and the semantics are essentially the same. I'd like to separate the semantic change from the refactoring and thus have extended the hack that's already in MemoryDependenceAnalysis for this change.

Differential Revision: http://reviews.llvm.org/D8591



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233140 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 23:54:54 +00:00
Rafael Espindola
4e86f54fdb Don't be over eager in evaluating a subtraction with a weak symbol.
In a subtraction of the form A - B, if B is weak, there is no way to represent
that on ELF since all relocations add the value of a symbol.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233139 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 23:48:44 +00:00
Reid Kleckner
d639e1975d X86: Fix frameescape when not using an FP
We can't use TargetFrameLowering::getFrameIndexOffset directly, because
Win64 really wants the offset from the stack pointer at the end of the
prologue. Instead, use X86FrameLowering::getFrameIndexOffsetFromSP(),
which is a pretty close approximiation of that. It fails to handle cases
with interestingly large stack alignments, which is pretty uncommon on
Win64 and is TODO.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233137 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 23:46:01 +00:00
Justin Bogner
fa71fb34ad Update a test I missed in r233132
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233134 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 23:44:03 +00:00
Justin Bogner
b9e97c799e llvm-cov: Require a subcommand when invoked as llvm-cov
A while ago llvm-cov gained support for clang's instrumentation based
profiling in addition to its gcov support, and subcommands were added
to choose which behaviour to use. When no subcommand was specified, we
fell back to gcov compatibility with a warning that a subcommand would
be required in the future. Now, we require the subcommand.

Note that if the basename of llvm-cov is gcov (via symlink or
hardlink, for example), we still use the gcov compatible behaviour
with no subcommand required.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233132 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 23:34:36 +00:00
Frederic Riss
28b71b1917 [dsymutil] Temporarily disable some tests on windows.
It seems one windows bot fails since I added ilne table linking to
llvm-dsymutil (see r232333 commit thread).
Disable the affected tests until I can figure out what's happening.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233130 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 23:11:07 +00:00
Sanjay Patel
14c1d068a3 optimize the AVX2 (integer) version of vperm2 into a shuffle
...because this is what happens when an instruction
set puts its underwear on after its pants.

This is an extension of r232852, r233100, and 233110:
http://llvm.org/viewvc/llvm-project?view=revision&revision=232852
http://llvm.org/viewvc/llvm-project?view=revision&revision=233100
http://llvm.org/viewvc/llvm-project?view=revision&revision=233110



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233127 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 22:39:29 +00:00
Philip Reames
6bde9f6994 Merge empty landing pads in SimplifyCFG
This patch tries to merge duplicate landing pads when they branch to a common shared target.

Given IR that looks like this:
lpad1:
  %exn = landingpad {i8*, i32} personality i32 (...)* @__gxx_personality_v0
         cleanup
  br label %shared_resume
lpad2:
  %exn2 = landingpad {i8*, i32} personality i32 (...)* @__gxx_personality_v0
          cleanup
  br label %shared_resume
shared_resume:
  call void @fn()
  ret void
}

We can rewrite the users of both landing pad blocks to use one of them. This will generally allow the shared_resume block to be merged with the common landing pad as well.

Without this change, tail duplication would likely kick in - creating N (2 in this case) copies of the shared_resume basic block.

Differential Revision: http://reviews.llvm.org/D8297



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233125 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 22:28:45 +00:00
Rafael Espindola
54ce82e497 Add -m -m elf_x86_64 to gold invocations.
Otherwise the tests would fail if the default was not elf_x86_64.

This fixes PR22966.

Patch by H.J. Lu!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233124 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 22:20:19 +00:00
David Blaikie
ef9962d9bb Revert "Remove an InstCombine that seems to have become redundant."
Assertion fires in compiler-rt. Guess it does fire..

This reverts commit r233116.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233121 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 21:50:35 +00:00
Rafael Espindola
45eaa023df Reset the CFA offset at the start of every FDE.
This fixes PR21515.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233120 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 21:47:31 +00:00
Peter Collingbourne
f99b7d0538 MC: Add more stringent symbol checking to test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233118 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 21:47:00 +00:00
David Blaikie
80da8623a4 Remove an InstCombine that seems to have become redundant.
Assert that this doesn't fire - I'll remove all of this later, but just
leaving it in for a while in case this is firing & we just don't have
test coverage.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233116 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 21:31:31 +00:00
Sanjay Patel
5e0ce9d13a [X86, AVX] instcombine vperm2 intrinsics with zero inputs into shuffles
This is the IR optimizer follow-on patch for D8563: the x86 backend patch
that converts this kind of shuffle back into a vperm2.

This is also a continuation of the transform that started in D8486. 
In that patch, Andrea suggested that we could convert vperm2 intrinsics that
use zero masks into a single shuffle. 

This is an implementation of that suggestion.

Differential Revision: http://reviews.llvm.org/D8567



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233110 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 20:36:42 +00:00
Rafael Espindola
71be19dff2 [llvm-readobj] add support for macho universal binary.
Patch by Keyue Hu (Chilledheart)!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233107 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 20:26:55 +00:00
Hans Wennborg
f61cd8b368 Revert r233062 ""float2int": Add a new pass to demote from float to int where possible."
This caused PR23008, compiles failing with: "Use still stuck around after Def is
destroyed: %.sroa.speculated"

Also reverting follow-up r233064.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233105 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 20:07:08 +00:00
Sanjoy Das
33a864aae2 [IRCE] Fix a regression introduced in r232444.
IRCE should not try to eliminate range checks that check an induction
variable against a loop-varying length.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233101 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 19:29:18 +00:00
Sanjay Patel
fe76881930 [X86, AVX] recognize shufflevector with zero input as a vperm2 (PR22984)
vperm2x128 instructions have the special ability (aka free hardware capability)
to shuffle zero values into a vector.

This patch recognizes that type of shuffle and generates the appropriate
control byte.

https://llvm.org/bugs/show_bug.cgi?id=22984

Differential Revision: http://reviews.llvm.org/D8563



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233100 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 19:19:07 +00:00
Duncan P. N. Exon Smith
15a873a6e2 Verifier: Start recursing into !dbg attachments
The main verifier already recurses through the other entry points, so we
might as well descend here too.

This temporarily duplicates some work already done in
`verifyDebugInfo()`, but eventually I'll be removing the other side.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233095 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 17:32:19 +00:00
Daniel Sanders
06426f54cb [mips] Support 16-bit offsets for 'm' inline assembly memory constraint.
Reviewers: vkalintiris

Reviewed By: vkalintiris

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8435

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233086 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 15:19:14 +00:00
Marek Olsak
a2705bbd42 R600/SI: Select V_BFE_U32 for and+shift with a non-literal offset
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233079 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 13:40:34 +00:00
Marek Olsak
226f794fba R600/SI: Custom-select 32-bit S_BFE from bitwise opcodes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233078 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 13:40:27 +00:00
Marek Olsak
945fab3447 R600/SI: Improve BFM support
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233077 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 13:40:21 +00:00
Marek Olsak
3f05a5e0ad R600/SI: Use V_FRACT_F64 for faster 64-bit floor on SI
Other f64 opcodes not supported on SI can be lowered in a similar way.

v2: use complex VOP3 patterns

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233076 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 13:40:15 +00:00
Marek Olsak
91c066ae15 R600/SI: Expand fract to floor, then only select V_FRACT on CI
V_FRACT is buggy on SI.

R600-specific code is left intact.

v2: drop the multiclass, use complex VOP3 patterns

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233075 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 13:40:08 +00:00
Daniel Sanders
b1058310c1 [mips] Distinguish 'R', 'ZC', and 'm' inline assembly memory constraint.
Summary:
Previous behaviour of 'R' and 'm' has been preserved for now. They will be
improved in subsequent commits.

The offset permitted by ZC varies according to the subtarget since it is
intended to match the restrictions of the pref, ll, and sc instructions.

The restrictions on these instructions are:
* For microMIPS: 12-bit signed offset.
* For Mips32r6/Mips64r6: 9-bit signed offset.
* Otherwise: 16-bit signed offset.

Reviewers: vkalintiris

Reviewed By: vkalintiris

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D8414

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233063 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 11:26:34 +00:00
James Molloy
a54c5b4489 "float2int": Add a new pass to demote from float to int where possible.
It is possible to have code that converts from integer to float, performs operations then converts back, and the result is provably the same as if integers were used.

This can come from different sources, but the most obvious is a helper function that uses floats but the arguments given at an inlined callsites are integers.

This pass considers all integers requiring a bitwidth less than or equal to the bitwidth of the mantissa of a floating point type (23 for floats, 52 for doubles) as exactly representable in floating point.

To reduce the risk of harming efficient code, the pass only attempts to perform complete removal of inttofp/fptoint operations, not just move them around.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233062 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-24 11:15:23 +00:00
Simon Pilgrim
fa17ce8b6e [SelectionDAG] Fixed issue with uitofp vector constant folding being treated as sitofp
While the uitofp scalar constant folding treats an integer as an unsigned value (from lang ref):

%X = sitofp i8 -1 to double ; yields double:-1.0
%Y = uitofp i8 -1 to double ; yields double:255.0

The vector constant folding was always using sitofp:

%X = sitofp <2 x i8> <i8 -1, i8 -1> to <2 x double> ; yields <double -1.0, double -1.0>
%Y = uitofp <2 x i8> <i8 -1, i8 -1> to <2 x double> ; yields <double -1.0, double -1.0>

This patch fixes this so that the correct opcode is used for sitofp and uitofp.

%X = sitofp <2 x i8> <i8 -1, i8 -1> to <2 x double> ; yields <double -1.0, double -1.0>
%Y = uitofp <2 x i8> <i8 -1, i8 -1> to <2 x double> ; yields <double 255.0, double 255.0>

Differential Revision: http://reviews.llvm.org/D8560

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233033 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-23 22:44:55 +00:00
Duncan P. N. Exon Smith
7ad96398c6 DebugInfo: Overload get() in DIDescriptor subclasses
Continue to simplify the `DIDescriptor` subclasses, so that they behave
more like raw pointers.  Remove `getRaw()`, replace it with an
overloaded `get()`, and overload the arrow and cast operators.  Two
testcases started to crash on the arrow operators with this change
because of `scope:` references that weren't real scopes.  I fixed them.
Soon I'll add verifier checks for them too.

This also adds explicit dereference operators.  Previously, the builtin
dereference against `operator MDNode *()` would have worked, but now the
builtins are ambiguous.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233030 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-23 21:54:07 +00:00
Rafael Espindola
79cd79b1e6 Refactor how passes get a symbol at the end of a section.
There is now a canonical symbol at the end of a section that different
passes can request.

This also allows us to assert that we don't switch back to a section whose
end symbol has already been printed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233026 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-23 21:22:04 +00:00
Ahmed Bougacha
c9ad3ab624 [AArch64, ARM] Enable GlobalMerge with -O3 rather than -O1.
The pass used to be enabled by default with CodeGenOpt::Less (-O1).
This is too aggressive, considering the pass indiscriminately merges
all globals together.

Currently, performance doesn't always improve, and, on code that uses
few globals (e.g., the odd file- or function- static), more often than
not is degraded by the optimization.  Lengthy discussion can be found
on llvmdev (AArch64-focused;  ARM has similar problems):
  http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-February/082800.html
Also, it makes tooling and debuggers less useful when dealing with
globals and data sections.

GlobalMerge needs to better identify those cases that benefit, and this
will be done separately.  In the meantime, move the pass to run with
-O3 rather than -O1, on both ARM and AArch64.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233024 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-23 21:17:36 +00:00
Chad Rosier
e07ca14413 [AArch64] Add FileCheck that was missing from test in r232967.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@233013 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-23 20:25:15 +00:00
Matt Arsenault
59a5e979b5 R600/SI: Allow commuting compares
This enables very common cases to switch to the
smaller encoding.

All of the standard LLVM canonicalizations of comparisons
are the opposite of what we want. Compares with constants
are moved to the RHS, but the first operand can be an inline
immediate, literal constant, or SGPR using the 32-bit VOPC
encoding.

There are additional bad canonicalizations that should
also be fixed, such as canonicalizing ge x, k to gt x, (k + 1)
if this makes k no longer an inline immediate value.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232988 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-23 18:45:30 +00:00
Chad Rosier
c1813d8fe1 [AArch64] Enable rematerialization of float 0 values.
Patch by Geoff Berry<gberry@codeaurora.org>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232967 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-23 17:19:34 +00:00
Bradley Smith
a75fecc370 Revert "[ARM] Add more pattern matching for f16 <-> f64 conversions"
This change is incorrect since it converts double rounding into single rounding,
which can produce different results. Instead this optimization will be done by
modifying Clang's codegen to not produce double rounding in the first place.

This reverts commit r232954.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232962 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-23 16:52:52 +00:00
Tom Stellard
fd58f22744 R600/SI: Fix crash in SIInstrInfo::areLoadsFromSameBasePtr()
This function assumed that SMRD instructions always have immediate
offsets, which is not always the case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232957 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-23 16:06:01 +00:00
Bradley Smith
de5be4657f [ARM] Add more pattern matching for f16 <-> f64 conversions
Specifically when the conversion is done in two steps, f16 -> f32 -> f64.

For example:

%1 = tail call float @llvm.convert.from.fp16.f32(i16 %0)
%conv = fpext float %1 to double

to:

vcvtb.f64.f16


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232954 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-23 15:59:54 +00:00
Petar Jovanovic
15863e5e5f Fix sign extension for MIPS64 in makeLibCall function
Fixing sign extension in makeLibCall for MIPS64. In MIPS64 architecture all
32 bit arguments (int, unsigned int, float 32 (soft float)) must be sign
extended. This fixes test "MultiSource/Applications/oggenc/".

Patch by Strahinja Petrovic.

Differential Revision: http://reviews.llvm.org/D7791


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232943 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-23 12:28:13 +00:00
Hal Finkel
e86dbbf058 [SDAG] Don't widen VSETCC during type legalization for split operands
Because the operands of a vector SETCC node can be of a different type from the
result (and often are), it can happen that even if we'd prefer to widen the
result type of the SETCC, the operands have been split instead. In this case,
the SETCC result also must be split. This mirrors what is done in
WidenVecRes_SELECT, and should be NFC elsewhere because if the operands are not
widened the following calls to GetWidenedVector will assert (which is what was
happening in the test case).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232935 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-23 08:22:43 +00:00
Lang Hames
858c62e51e [Orc] Add missing -use-orcmcjit flag to a number of Orc regression tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232931 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-23 06:02:49 +00:00
Duncan P. N. Exon Smith
031fbaeb29 Prevent CHECK-NOTs from matching file paths
A build directory with a name like `build-Werror` would hit a false
positive on these `CHECK-NOT`s before, since the actual error line looks
like:

    .../build-Werror/bin/llvm-as <stdin>:1:2: error: ...

Switch to using:

    CHECK-NOT: error:

(note the trailing semi-colon) to avoid matching almost any file path.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232917 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-22 15:58:21 +00:00
Benjamin Kramer
42a84b54bf [SimplifyLibCalls] Fix negative shifts being produced by the memchr -> bitfield transform.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232903 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-21 22:04:26 +00:00
Benjamin Kramer
fd48a80e14 [SimplifyLibCalls] Turn memchr(const, C, const) into a bitfield check.
strchr("123!", C) != nullptr is a common pattern to check if C is one
of 1, 2, 3 or !. If the largest element of the string is smaller than
the target's register size we can easily create a bitfield and just
do a simple test for set membership.

int foo(char C) { return strchr("123!", C) != nullptr; } now becomes

	cmpl	$64, %edi ## range check
	sbbb	%al, %al
	movabsq	$0xE000200000001, %rcx
	btq	%rdi, %rcx ## bit test
	sbbb	%cl, %cl
	andb	%al, %cl ## and the two conditions
	andb	$1, %cl
	movzbl	%cl, %eax ## returning an int
	ret

(imho the backend should expand this into a series of branches, but
that's a different story)

The code is currently limited to bit fields that fit in a register, so
usually 64 or 32 bits. Sadly, this misses anything using alpha chars
or {}. This could be fixed by just emitting a i128 bit field, but that
can generate really ugly code so we have to find a better way. To some
degree this is also recreating switch lowering logic, but we can't
simply emit a switch instruction and thus change the CFG within
instcombine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232902 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-21 21:09:33 +00:00
Matt Arsenault
2bb644f469 R600: Cleanup test with multiple check prefixes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232901 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-21 19:15:46 +00:00
Benjamin Kramer
4b74df7229 SimplifyLibCalls: Add basic optimization of memchr calls.
This is just memchr(x, y, 0) -> nullptr and constant folding.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232896 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-21 15:36:21 +00:00
Simon Pilgrim
c58f32d981 Tidied up vec_zero_cse.ll test. NFCI.
Added target triple and refactored the CHECKs to be per function.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232894 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-21 14:05:12 +00:00
David Majnemer
ac5b42cad8 MemoryDependenceAnalysis: Don't miscompile atomics
r216771 introduced a change to MemoryDependenceAnalysis that allowed it
to reason about acquire/release operations.  However, this change does
not ensure that the acquire/release operations pair.  Unfortunately,
this leads to miscompiles as we won't see an acquire load as properly
memory effecting.  This largely reverts r216771.

This fixes PR22708.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232889 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-21 06:19:17 +00:00
Tim Northover
048ca17f6e AArch64: simplify test case
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232886 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-21 04:37:08 +00:00
Eric Christopher
ae6fc14d54 Remove the bare getSubtargetImpl call from the AArch64 port. As part
of this add a test that shows we can generate code for functions
that specifically enable a subtarget feature.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232884 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-21 04:04:50 +00:00
Eric Christopher
bc473edd7b Remove the bare getSubtargetImpl call from the PPC port. As part
of this add a test that shows we can generate code with
for functions that differ by subtarget feature.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232882 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-21 03:36:02 +00:00
Eric Christopher
6f125f52d3 Cache the Function dependent subtarget on the MachineFunction.
As preparation for removing the getSubtargetImpl() call from
TargetMachine go ahead and flip the switch on caching the function
dependent subtarget and remove the bare getSubtargetImpl call
from the X86 port. As part of this add a few tests that show we
can generate code and assemble on X86 based on features/cpu on
the Function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232879 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-21 03:13:10 +00:00
Kostya Serebryany
a1ea57a185 [sanitizer] experimental tracing for cmp instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232873 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-21 01:29:36 +00:00
Ahmed Bougacha
995f4f8fd1 [CodeGen][IfCvt] Don't re-ifcvt blocks with unanalyzable terminators.
If we couldn't analyze its terminator (i.e., it's an indirectbr, or some
other weirdness), we can't safely re-if-convert a predicated block,
because we can't tell whether the predicated terminator can
fallthrough (it does).

Currently, we would completely ignore the fallthrough successor. In
the added testcase, this means we used to generate:

    ...
  @ %entry:
    cmp   r5, #21
    ittt  ne
  @ %cc1f:
    cmpne r7, #42
  @ %cc2t:
    strne.w       r5, [r8]
    movne pc, r10
  @ %cc1t:
    ...

Whereas the successor of %cc1f was originally %bb1.
With the fix, we get the correct:

    ...
  @ %entry:
    cmp   r5, #21
    itt   eq
  @ %cc1t:
    streq.w       r5, [r11]
    moveq pc, r0
  @ %cc1f:
    cmp   r7, #42
    itt   ne
  @ %cc2t:
    strne.w       r5, [r8]
    movne pc, r10
  @ %bb1:
    ...

rdar://20192768
Differential Revision: http://reviews.llvm.org/D8509


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232872 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-21 01:23:15 +00:00
Ahmed Bougacha
165bd1733b [AArch64] Prefer UZP for concat_vector of illegal truncs.
Follow-up to r232459: prefer a UZP shuffle to the intermediate truncs.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232871 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-21 01:08:39 +00:00
Yunzhong Gao
2c11db2e64 Tell lit.cfg about more Windows triples.
For example, the host triple on my 64-bit PC is x86_64-pc-windows-msvc.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232854 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-20 22:08:40 +00:00
Sanjay Patel
be9ee96926 [X86, AVX] instcombine common cases of vperm2* intrinsics into shuffles
vperm2* intrinsics are just shuffles. 
In a few special cases, they're not even shuffles.

Optimizing intrinsics in InstCombine is better than
handling this in the front-end for at least two reasons:

1. Optimizing custom-written SSE intrinsic code at -O0 makes vector coders
   really angry (and so I have regrets about some patches from last week).

2. Doing mask conversion logic in header files is hard to write and 
   subsequently read.

There are a couple of TODOs in this patch to complete this optimization.

Differential Revision: http://reviews.llvm.org/D8486



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232852 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-20 21:47:56 +00:00
Andrew Kaylor
e0e1c1d94d Fixing a bug with WinEH PHI handling
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232851 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-20 21:42:54 +00:00
Sanjay Patel
39110ecd35 [X86] Prefer blendps over insertps codegen for one special case
With this patch, for this one exact case, we'll generate:

  blendps %xmm0, %xmm1, $1

instead of:

  insertps %xmm0, %xmm1, $0

If there's a memory operand available for load folding and we're
optimizing for size, we'll still generate the insertps.

The detailed performance data motivation for this may be found in D7866; 
in summary, blendps has 2-3x throughput vs. insertps on widely used chips.

Differential Revision: http://reviews.llvm.org/D8332



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232850 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-20 21:19:52 +00:00
Rafael Espindola
d80979b25d Don't declare all text sections at the start of the .s
The code this patch removes was there to make sure the text sections went
before the dwarf sections. That is necessary because MachO uses offsets
relative to the start of the file, so adding a section can change relaxations.

The dwarf sections were being printed at the start just to produce symbols
pointing at the start of those sections.

The underlying issue was fixed in r231898. The dwarf sections are now printed
when they are about to be used, which is after we printed the text sections.

To make sure we don't regress, the patch makes the MachO streamer assert
if CodeGen puts anything unexpected after the DWARF sections.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232842 91177308-0d34-0410-b5e6-96231b3b80d8
2015-03-20 20:00:01 +00:00