Commit Graph

26110 Commits

Author SHA1 Message Date
Hal Finkel
1d6c2d717d Add an AlignmentFromAssumptions Pass
This adds a ScalarEvolution-powered transformation that updates load, store and
memory intrinsic pointer alignments based on invariant((a+q) & b == 0)
expressions. Many of the simple cases we can get with ValueTracking, but we
still need something like this for the more complicated cases (such as those
with an offset) that require some algebra. Note that gcc's
__builtin_assume_aligned's optional third argument provides exactly for this
kind of 'misalignment' offset for which this kind of logic is necessary.

The primary motivation is to fixup alignments for vector loads/stores after
vectorization (and unrolling). This pass is added to the optimization pipeline
just after the SLP vectorizer runs (which, admittedly, does not preserve SE,
although I imagine it could).  Regardless, I actually don't think that the
preservation matters too much in this case: SE computes lazily, and this pass
won't issue any SE queries unless there are any assume intrinsics, so there
should be no real additional cost in the common case (SLP does preserve DT and
LoopInfo).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217344 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-07 20:05:11 +00:00
Hal Finkel
83d886db3a Add additional patterns for @llvm.assume in ValueTracking
This builds on r217342, which added the infrastructure to compute known bits
using assumptions (@llvm.assume calls). That original commit added only a few
patterns (to catch common cases related to determining pointer alignment); this
change adds several other patterns for simple cases.

r217342 contained that, for assume(v & b = a), bits in the mask
that are known to be one, we can propagate known bits from the a to v. It also
had a known-bits transfer for assume(a = b). This patch adds:

assume(~(v & b) = a) : For those bits in the mask that are known to be one, we
                       can propagate inverted known bits from the a to v.

assume(v | b = a) :    For those bits in b that are known to be zero, we can
                       propagate known bits from the a to v.

assume(~(v | b) = a):  For those bits in b that are known to be zero, we can
                       propagate inverted known bits from the a to v.

assume(v ^ b = a) :    For those bits in b that are known to be zero, we can
		       propagate known bits from the a to v. For those bits in
		       b that are known to be one, we can propagate inverted
                       known bits from the a to v.

assume(~(v ^ b) = a) : For those bits in b that are known to be zero, we can
		       propagate inverted known bits from the a to v. For those
		       bits in b that are known to be one, we can propagate
                       known bits from the a to v.

assume(v << c = a) :   For those bits in a that are known, we can propagate them
                       to known bits in v shifted to the right by c.

assume(~(v << c) = a) : For those bits in a that are known, we can propagate
                        them inverted to known bits in v shifted to the right by c.

assume(v >> c = a) :   For those bits in a that are known, we can propagate them
                       to known bits in v shifted to the right by c.

assume(~(v >> c) = a) : For those bits in a that are known, we can propagate
                        them inverted to known bits in v shifted to the right by c.

assume(v >=_s c) where c is non-negative: The sign bit of v is zero

assume(v >_s c) where c is at least -1: The sign bit of v is zero

assume(v <=_s c) where c is negative: The sign bit of v is one

assume(v <_s c) where c is non-positive: The sign bit of v is one

assume(v <=_u c): Transfer the known high zero bits

assume(v <_u c): Transfer the known high zero bits (if c is know to be a power
                 of 2, transfer one more)

A small addition to InstCombine was necessary for some of the test cases. The
problem is that when InstCombine was simplifying and, or, etc. it would fail to
check the 'do I know all of the bits' condition before checking less specific
conditions and would not fully constant-fold the result. I'm not sure how to
trigger this aside from using assumptions, so I've just included the change
here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217343 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-07 19:21:07 +00:00
Hal Finkel
851b04c920 Make use of @llvm.assume in ValueTracking (computeKnownBits, etc.)
This change, which allows @llvm.assume to be used from within computeKnownBits
(and other associated functions in ValueTracking), adds some (optional)
parameters to computeKnownBits and friends. These functions now (optionally)
take a "context" instruction pointer, an AssumptionTracker pointer, and also a
DomTree pointer, and most of the changes are just to pass this new information
when it is easily available from InstSimplify, InstCombine, etc.

As explained below, the significant conceptual change is that known properties
of a value might depend on the control-flow location of the use (because we
care that the @llvm.assume dominates the use because assumptions have
control-flow dependencies). This means that, when we ask if bits are known in a
value, we might get different answers for different uses.

The significant changes are all in ValueTracking. Two main changes: First, as
with the rest of the code, new parameters need to be passed around. To make
this easier, I grouped them into a structure, and I made internal static
versions of the relevant functions that take this structure as a parameter. The
new code does as you might expect, it looks for @llvm.assume calls that make
use of the value we're trying to learn something about (often indirectly),
attempts to pattern match that expression, and uses the result if successful.
By making use of the AssumptionTracker, the process of finding @llvm.assume
calls is not expensive.

Part of the structure being passed around inside ValueTracking is a set of
already-considered @llvm.assume calls. This is to prevent a query using, for
example, the assume(a == b), to recurse on itself. The context and DT params
are used to find applicable assumptions. An assumption needs to dominate the
context instruction, or come after it deterministically. In this latter case we
only handle the specific case where both the assumption and the context
instruction are in the same block, and we need to exclude assumptions from
being used to simplify their own ephemeral values (those which contribute only
to the assumption) because otherwise the assumption would prove its feeding
comparison trivial and would be removed.

This commit adds the plumbing and the logic for a simple masked-bit propagation
(just enough to write a regression test). Future commits add more patterns
(and, correspondingly, more regression tests).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217342 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-07 18:57:58 +00:00
David Blaikie
22f8dcb2b5 DebugInfo: Do not use DW_FORM_GNU_addr_index in skeleton CUs, GDB 7.8 errors on this.
It's probably not a huge deal to not do this - if we could, maybe the
address could be reused by a subprogram low_pc and avoid an extra
relocation, but it's just one per CU at best.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217338 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-07 17:31:42 +00:00
Hal Finkel
3d03d60ca8 Add functions for finding ephemeral values
This adds a set of utility functions for collecting 'ephemeral' values. These
are LLVM IR values that are used only by @llvm.assume intrinsics (directly or
indirectly), and thus will be removed prior to code generation, implying that
they should be considered free for certain purposes (like inlining). The
inliner's cost analysis, and a few other passes, have been updated to account
for ephemeral values using the provided functionality.

This functionality is important for the usability of @llvm.assume, because it
limits the "non-local" side-effects of adding llvm.assume on inlining, loop
unrolling, etc. (these are hints, and do not generate code, so they should not
directly contribute to estimates of execution cost).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217335 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-07 13:49:57 +00:00
Chandler Carruth
8ceea90956 [x86] Revert my over-eager commit in r217332.
I hadn't actually run all the tests yet and these combines have somewhat
surprisingly far reaching effects.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217333 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-07 12:37:11 +00:00
Chandler Carruth
e328c5ea83 [x86] Tweak the rules surrounding 0,0 and 1,1 v2f64 shuffles and add
support for MOVDDUP which is really important for matrix multiply style
operations that do lots of non-vector-aligned load and splats.

The original motivation was to add support for MOVDDUP as the lack of it
regresses matmul_f64_4x4 by 5% or so. However, all of the rules here
were somewhat suspicious.

First, we should always be using the floating point domain shuffles,
regardless of how many copies we have to make as a movapd is *crazy*
faster than the domain switching cost on some chips. (Mostly because
movapd is crazy cheap.) Because SHUFPD can't do the copy-for-free trick
of the PSHUF instructions, there is no need to avoid canonicalizing on
UNPCK variants, so do that canonicalizing. This also ensures we have the
chance to form MOVDDUP. =]

Second, we assume SSE2 support when doing any vector lowering, and given
that we should just use UNPCKLPD and UNPCKHPD as they can operate on
registers or memory. If vectors get spilled or come from memory at all
this is going to allow the load to be folded into the operation. If we
want to optimize for encoding size (the only difference, and only
a 2 byte difference) it should be done *much* later, likely after RA.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217332 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-07 12:02:14 +00:00
Matt Arsenault
6a712e709d R600/SI: Relax a few tests to help enable scheduler
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217320 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-06 20:44:41 +00:00
Matt Arsenault
360ed46f68 R600/SI: Fix broken check lines.
Fix missing check, and hardcoded register numbers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217318 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-06 20:37:56 +00:00
Saleem Abdulrasool
5f61134595 MC: correct DWARF line info for PE/COFF
DWARF address ranges contain a reference to the debug_info section.  This offset
is an absolute relocation except on non-PE/COFF targets where it is section
relative.  We would emit this incorrectly, and trying to map the debug info from
the address would fail.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217317 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-06 19:57:48 +00:00
Chandler Carruth
7cd7154421 [x86] Fix a pretty horrible bug and inconsistency in the x86 asm
parsing (and latent bug in the instruction definitions).

This is effectively a revert of r136287 which tried to address
a specific and narrow case of immediate operands failing to be accepted
by x86 instructions with a pretty heavy hammer: it introduced a new kind
of operand that behaved differently. All of that is removed with this
commit, but the test cases are both preserved and enhanced.

The core problem that r136287 and this commit are trying to handle is
that gas accepts both of the following instructions:

  insertps $192, %xmm0, %xmm1
  insertps $-64, %xmm0, %xmm1

These will encode to the same byte sequence, with the immediate
occupying an 8-bit entry. The first form was fixed by r136287 but that
broke the prior handling of the second form! =[ Ironically, we would
still emit the second form in some cases and then be unable to
re-assemble the output.

The reason why the first instruction failed to be handled is because
prior to r136287 the operands ere marked 'i32i8imm' which forces them to
be sign-extenable. Clearly, that won't work for 192 in a single byte.
However, making thim zero-extended or "unsigned" doesn't really address
the core issue either because it breaks negative immediates. The correct
fix is to make these operands 'i8imm' reflecting that they can be either
signed or unsigned but must be 8-bit immediates. This patch backs out
r136287 and then changes those places as well as some others to use
'i8imm' rather than one of the extended variants.

Naturally, this broke something else. The custom DAG nodes had to be
updated to have a much more accurate type constraint of an i8 node, and
a bunch of Pat immediates needed to be specified as i8 values.

The fallout didn't end there though. We also then ceased to be able to
match the instruction-specific intrinsics to the instructions so
modified. Digging, this is because they too used i32 rather than i8 in
their signature. So I've also switched those intrinsics to i8 arguments
in line with the instructions.

In order to make the intrinsic adjustments of course, I also had to add
auto upgrading for the intrinsics.

I suspect that the intrinsic argument types may have led everything down
this rabbit hole. Pretty happy with the result.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217310 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-06 10:00:01 +00:00
Chandler Carruth
469c73bc27 [x86] Fix an embarressing bug in the INSERTPS formation code. The mask
computation was totally wrong, but somehow it didn't really show up with
llc.

I've added an assert that triggers on multiple existing test cases and
updated one of them to show the correct value.

There appear to still be more bugs lurking around insertps's mask. =/
However, note that this only really impacts the new vector shuffle
lowering.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217289 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 23:19:45 +00:00
Akira Hatanaka
cbbae7f41d [inline asm] Add a check in InlineAsm::ConstraintInfo::Parse to make sure '{'
follows '~' in a clobber constraint string.

Previously llc would hit an llvm_unreachable when compiling an inline-asm
instruction with malformed constraint string "~x{21}". This commit enables
LLParser to catch the error earlier and print a more helpful diagnostic.

rdar://problem/14206559


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217288 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 22:30:32 +00:00
Sanjay Patel
52af82df95 Allow vector fsub ops with constants to get the same optimizations as scalars.
This problem is bigger than just fsub, but this is the minimum fix to solve
fneg for PR20556 ( http://llvm.org/bugs/show_bug.cgi?id=20556 ), and we solve
zero subtraction with the same change.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217286 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 22:26:22 +00:00
Rafael Espindola
e8b19acded Fix pr20078.
When linking llvm.global_ctors with the optional third element we have to handle
it specially and only copy the elements whose keys were also copied.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217281 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 21:27:52 +00:00
Bjorn Steinbrink
ea388503c4 Restore the ability to check if LLVMCreateObjectFile was successful
Summary:
Until r216870 LLVMCreateObjectFile returned nullptr in case of an error,
so callers could check if the call was successful. Now, it always
returns an OwningBinary wrapped as an LLVMObjectFileRef, so callers
can't check if the call was successul.

This results in a segfault running e.g.

 llvm-c-test --object-list-sections < /dev/null

So the old behaviour should be restored.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5143

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217279 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 21:22:09 +00:00
Alexey Samsonov
54543afeba [DWARF parser] Fix nasty memory corruption in .dwo files handling.
Forge a test case where llvm-symbolizer has to use external .dwo
file to produce the inlining information.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217270 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 19:29:45 +00:00
Rafael Espindola
1455ac7a30 The gold tests also require ppc to be compiled in.
We could create a tools/gold/PowerPC and a tools/gold/X86, but it doesn't seem
worth it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217267 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 19:01:12 +00:00
Rafael Espindola
eaa85e2027 Revert "Disable the fix for pr20793 because of a gnu ld bug."
This reverts commit r217211.

Both the bfd ld and gold outputs were valid. They were using a Rela relocation,
so the value present in the relocated location was not used, which caused me
to misread the output.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217264 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 18:03:38 +00:00
Matt Arsenault
89a7e3ec3e R600/SI: Use same complex patterns for DS atomics
This fixes hitting the same negative base offset problem
that was already fixed for regular loads and stores.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217256 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 16:24:58 +00:00
Daniel Sanders
353cf20b9b [mips] Marked the Trap-on-Condition instructions as Mips II
Patch by Vasileios Kalintiris.

Reviewers: dsanders

Reviewed By: dsanders

Differential Revision: http://reviews.llvm.org/D5173


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217255 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 15:50:13 +00:00
Jan Vesely
286f644bce R600: Fix FROUND
round halfway cases away from zero

Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <tom@stellard.net>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217250 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 14:26:54 +00:00
Tom Stellard
7cda2d0666 R600/SI: Use S_ADD_U32 and S_SUB_U32 for low half of 64-bit operations
https://bugs.freedesktop.org/show_bug.cgi?id=83416

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217248 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 14:07:59 +00:00
Chandler Carruth
c1c5dcf069 [x86] Factor out the zero vector insertion logic in the new vector
shuffle lowering for integer vectors and share it from v4i32, v8i16, and
v16i8 code paths.

Ironically, the SSE2 v16i8 code for this is now better than the SSSE3!
=] Will have to fix the SSSE3 code next to just using a single pshufb.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217240 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 10:36:31 +00:00
Frederic Riss
94f5d4480a [dwarfdump] Dump DW_AT_(decl|call)_line attribute values as decimal values.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217232 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 07:21:50 +00:00
Frederic Riss
eeb0520463 Reapply "[dwarfdump] Add missing DW_LANG_Mips_Assembler case to LanguageString()"
This commit was reverted in r217183, but is OK to go in again now that its dependency is commited (as of r217186).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217231 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 07:21:40 +00:00
David Majnemer
163462eec8 InstCombine: Remove a special case pattern
The special case did not work when run under -reassociate and can easily
be expressed by a further generalization of an existing pattern.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217227 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 06:09:24 +00:00
Saleem Abdulrasool
7da07d7834 MC: correct DWARF header for PE/COFF assembly input
The header contains an offset to the DWARF line table for the CU.  The offset
must be section relative for COFF and absolute for others.  The non-assembly
code path for the DWARF header generation already has the correct emission for
the headers.  This corrects the assembly input path.

This was identified by BFD objecting to the LLVM generated DWARF information.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217222 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 04:15:00 +00:00
Jiangning Liu
b20b9bf9fd [AArch64] Add pass to enable additional comparison optimizations by CSE.
Patched by Sergey Dmitrouk.

This pass tries to make consecutive compares of values use same operands to
allow CSE pass to remove duplicated instructions. For this it analyzes
branches and adjusts comparisons with immediate values by converting:

GE -> GT
GT -> GE
LT -> LE
LE -> LT

and adjusting immediate values appropriately. It basically corrects two
immediate values towards each other to make them equal.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217220 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 02:55:24 +00:00
Rafael Espindola
6cf4a0f506 Disable the fix for pr20793 because of a gnu ld bug.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217211 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-05 00:14:12 +00:00
Rafael Espindola
295a0088db Fix pr20793.
With this patch the third field of llvm.global_ctors is also used on ELF.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217202 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 23:03:58 +00:00
Frederic Riss
a49caa5e3f [ dwarfdump ] Add symbolic dump of known DWARF attribute values.
Reviewed By: dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5187

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217186 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 19:39:20 +00:00
Frederic Riss
4b2e523613 Revert "[dwarfdump] Add missing DW_LANG_Mips_Assembler case to LanguageString()"
This reverts commit 93c7e6161e1adbd2c7ac81fa081823183035cb64.

This commit got approved first, but was dependant on another one going in (The one pretty printing attribute values). I'll reapply when the other one is in.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217183 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 18:55:46 +00:00
Frederic Riss
e4e2997f8b [dwarfdump] Add missing DW_LANG_Mips_Assembler case to LanguageString()
Reviewed By: dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5193

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217182 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 18:40:23 +00:00
Reid Kleckner
b9cb76d3f3 MC Win64: Put unwind info for COMDAT code into the same COMDAT group
Summary:
This fixes a long standing issue where we would emit many little .text
sections and only one .pdata and .xdata section. Now we generate one
.pdata / .xdata pair per .text section and associate them correctly.

Fixes PR19667.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5181

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217176 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 17:42:03 +00:00
Kevin Enderby
54dfc533fc Removed the ctime printed “time stamp” from macho-private-headers.test to fix the builds.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217175 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 17:13:44 +00:00
Kevin Enderby
37598b62a7 Adds the next bit of support for llvm-objdump’s -private-headers for executable Mach-O files.
This adds the printing of more load commands, so that the normal load commands
in a typical X86 Mach-O executable can all be printed.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217172 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 16:54:47 +00:00
Tim Northover
8dcac5d77a AArch64: fix vector-immediate BIC/ORR on big-endian devices.
Follow up to r217138, extending the logic to other NEON-immediate instructions.
As before, the instruction already performs the correct operation and we're
just using a different type for convenience, so we want a true nop-cast.

Patch by Asiri Rathnayake.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217159 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 15:05:24 +00:00
Tim Northover
dfe4e3e706 AArch64: fix big-endian immediate materialisation
We were materialising big-endian constants using DAG nodes with types different
from what was requested, followed by a bitcast. This is fine on little-endian
machines where bitcasting is a nop, but we need a slightly different
representation for big-endian. This adds a new set of NVCAST (natural-vector
cast) operations which are always nops.

Patch by Asiri Rathnayake.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217138 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 09:46:14 +00:00
Chandler Carruth
ae98867126 [x86] Teach the new v4i32 shuffle lowering some more tricks to recognize
vzext patterns and insert-element patterns that for SSE4 have dedicated
instructions.

With this we can enable the experimental mode in a regression test that
happens to cover some of the past set of issues. You can see that the
new logic does significantly better here on the floating point cases.

A follow-up to this change and the previous ones will hoist the logic
into helpers so it can be shared across element type sizes as in this
particular case it generalizes cleanly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217136 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 09:26:30 +00:00
Lang Hames
21797d6cd6 [MCJIT] Make sure eh-frame fixups use the target's pointer type, not the host's.
If the wrong pointer type is used it can cause corruption of the frame
description entries.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217124 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 04:53:03 +00:00
Juergen Ributzka
cd72c216cd Revert r216803 "[MachineSinking] Clear kill flag of all operands at all their uses."
This reverts commit r216803, because it might have broken the buildbot.
The issue is tracked in PR20842.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217120 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 02:07:36 +00:00
Juergen Ributzka
68a4ab08b3 [FastISel][AArch64] Add target-specific lowering for logical operations.
This change adds support for immediate and shift-left folding into logical
operations.

This fixes rdar://problem/18223183.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217118 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 01:29:18 +00:00
Chandler Carruth
fa2dfaedf2 [x86] Teach the new vector shuffle lowering about the zero masking
abilities of INSERTPS which are really powerful and come up in very
important contexts such as forming diagonal matrices, etc.

With this I ended up being able to remove the somewhat weird helper
I added for INSERTPS because we can collapse the entire state to a no-op
mask. Added a bunch of tests for inserting into a zero-ish vector.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217117 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-04 01:13:48 +00:00
Matt Arsenault
c9cc488dfe R600/SI: Try to keep i32 mul on SALU
Also fix bug this exposed where when legalizing an immediate
operand, a v_mov_b32 would be created with a VSrc dest register.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217108 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 23:24:35 +00:00
Kostya Serebryany
c9b2548b23 [asan] fix debug info produced for asan-coverage=2
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217106 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 23:24:18 +00:00
David Majnemer
c0f2b8b528 IndVarSimplify: Don't let LFTR compare against a poison value
LinearFunctionTestReplace tries to use the *next* indvar to compare
against when possible.  However, it may be the case that the calculation
for the next indvar has NUW/NSW flags and that it may only be safely
used inside the loop.  Using it in a comparison to calculate the exit
condition could result in observing poison.

This fixes PR20680.

Differential Revision: http://reviews.llvm.org/D5174

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217102 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 23:03:18 +00:00
Chandler Carruth
699fd1909e [x86] Teach the new vector shuffle lowering about the simplest of
'insertps' patterns.

This replaces two shuffles with a single insertps in very common cases.
My next patch will extend this to leverage the zeroing capabilities of
insertps which will allow it to be used in a much wider set of cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217100 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 22:48:34 +00:00
Kostya Serebryany
f12b1d8c7b [asan] add -asan-coverage=3: instrument all blocks and critical edges.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217098 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 22:37:37 +00:00
Robin Morisset
4b2698cf19 Use target-dependent emitLeading/TrailingFence instead of the target-independent insertLeading/TrailingFence (in AtomicExpandPass)
Fixes two latent bugs:
- There was no fence inserted before expanded seq_cst load (unsound on Power)
- There was only a fence release before seq_cst stores (again unsound, in particular on Power)
    It is not even clear if this is correct on ARM swift processors (where release fences are
    DMB ishst instead of DMB ish). This behaviour is currently preserved on ARM Swift
    as it is not clear whether it is incorrect. I would love to get documentation stating
    whether it is correct or not.
These two bugs were not triggered because Power is not (yet) using this pass, and these
behaviours happen to be (mostly?) working on ARM
(although they completely butchered the semantics of the llvm IR).

See:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075821.html
for an example of the problems that can be caused by the second of these bugs.

I couldn't see a way of fixing these in a completely target-independent way without
adding lots of unnecessary fences on ARM, hence the target-dependent parts of this
patch.

This patch implements the new target-dependent parts only for ARM (the default
of not doing anything is enough for AArch64), other architectures will use this
infrastructure in later patches.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217076 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 21:01:03 +00:00
Chandler Carruth
36cf5d68be [x86] Add an SSE4.1 mode to this test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217072 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 20:39:06 +00:00
Chandler Carruth
87508f1d87 [x86] Make this test check everything for both SSE2 and AVX1 modes,
using a common 'all' prefix for the common test output.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217063 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 19:39:10 +00:00
Lang Hames
07ad198d6c Add a regression test to sanity check the PBQP allocator.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217057 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 18:04:10 +00:00
Sanjay Patel
b89304eb7c Preserve IR flags (nsw, nuw, exact, fast-math) in SLP vectorizer (PR20802).
The SLP vectorizer should propagate IR-level optimization hints/flags (nsw, nuw, exact, fast-math)
when converting scalar instructions into vectors. But this isn't a simple copy - we need to take
the intersection (the logical 'and') of the sets of flags on the scalars.

The solution is further complicated because we can have non-uniform (non-SIMD) vector ops after:
http://reviews.llvm.org/D4015
http://llvm.org/viewvc/llvm-project?view=revision&revision=211339

The vast majority of changed files are existing tests that were not propagating IR flags, but I've
also added a new test file for focused testing of IR flag possibilities.

Differential Revision: http://reviews.llvm.org/D5172



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217051 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 17:40:30 +00:00
Rafael Espindola
c359193486 Update to not depend on "llvm-objdump -d -symbolize".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217047 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 16:16:02 +00:00
Tom Stellard
ce4caf146f R600/SI: Add a pattern for i64 and in a branch
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217041 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 15:22:41 +00:00
Renato Golin
218805d21d Check-label a bit more specific
Sometimes, the .file could be reordered and it'd identify the ldr in the filename as a bad match.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217037 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 13:32:08 +00:00
Alexander Potapenko
fac68d2d70 Fix PR20800: correctly calculate the offset of the subq instruction when generating compact unwind info.
This CL replaces the constant DarwinX86AsmBackend.PushInstrSize with a method
that lets the backend account for different sizes of "push %reg" instruction
sizes.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217020 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 07:11:34 +00:00
Juergen Ributzka
847547086d Reapply r216805 "[MachineCombiner][AArch64] Use the correct register class for MADD, SUB, and OR.""
This reapplies r216805 with a fix to a copy-past error, which resulted in an
incorrect register class.

Original commit message:
Select the correct register class for the various instructions that are
generated when combining instructions and constrain the registers to the
appropriate register class.

This fixes rdar://problem/18183707.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217019 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 07:07:10 +00:00
Juergen Ributzka
dd7a7107c1 [FastISel][AArch64] Add target-dependent instruction selection for Add/Sub.
There is already target-dependent instruction selection support for Adds/Subs to
support compares and the intrinsics with overflow check. This takes advantage of
the existing infrastructure to also support Add/Sub, which allows the folding of
immediates, sign-/zero-extends, and shifts.

This fixes rdar://problem/18207316.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217007 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 01:38:36 +00:00
Nick Kledzik
46d52b1627 Fix test case to match correct llvm-objdump output
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217006 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-03 01:34:58 +00:00
Renato Golin
418103c4d4 Missing test from r216989
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216990 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 22:46:18 +00:00
Renato Golin
ddcf3bd0a0 Only emit movw on ARMv6T2+
Fix PR18364.

Patch by Dimitry Andric.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216989 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 22:45:13 +00:00
Juergen Ributzka
79ec2ed417 [FastISel][AArch64] Use the target-dependent selection code for shifts first.
This uses the target-dependent selection code for shifts first, which allows us
to create better code for shifts with immediates and sign-/zero-extend folding.

Vector type are not handled yet and the code falls back to target-independent
instruction selection for these cases.

This fixes rdar://problem/17907920.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216985 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 22:33:57 +00:00
Sean Silva
73d86aa56a Nuke MCAnalysis.
The code is buggy and barely tested. It is also mostly boilerplate.
(This includes MCObjectDisassembler, which is the interface to that
functionality)

Following an IRC discussion with Jim Grosbach, it seems sensible to just
nuke the whole lot of functionality, and dig it up from VCS if
necessary (I hope not!).

All of this stuff appears to have been added in a huge patch dump (look
at the timeframe surrounding e.g. r182628) where almost every patch
seemed to be untested and not reviewed before being committed.
Post-review responses to the patches were never addressed. I don't think
any of it would have passed pre-commit review.

I doubt anyone is depending on this, since this code appears to be
extremely buggy. In limited testing that Michael Spencer and I did, we
couldn't find a single real-world object file that wouldn't crash the
CFG reconstruction stuff. The symbolizer stuff has O(n^2) behavior and
so is not much use to anyone anyway. It seemed simpler to remove them as
a whole. Most of this code is boilerplate, which is the only way it was
able to scrape by 60% coverage.

HEADSUP: Modules folks, some files I nuked were referenced from
include/llvm/module.modulemap; I just deleted the references. Hopefully
that is the right fix (one was a FIXME though!).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216983 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 22:32:20 +00:00
Eric Christopher
d5dd8ce2a5 Reinstate "Nuke the old JIT."
Approved by Jim Grosbach, Lang Hames, Rafael Espindola.

This reinstates commits r215111, 215115, 215116, 215117, 215136.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216982 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 22:28:02 +00:00
Robin Morisset
76b55cc4b1 [X86] Allow atomic operations using immediates to avoid using a register
The only valid lowering of atomic stores in the X86 backend was mov from
register to memory. As a result, storing an immediate required a useless copy
of the immediate in a register. Now these can be compiled as a simple mov.

Similarily, adding/and-ing/or-ing/xor-ing an
immediate to an atomic location (but through an atomic_store/atomic_load,
not a fetch_whatever intrinsic) can now make use of an 'add $imm, x(%rip)'
instead of using a register. And the same applies to inc/dec.

This second point matches the first issue identified in
  http://llvm.org/bugs/show_bug.cgi?id=17281

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216980 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 22:16:29 +00:00
Kostya Serebryany
891198b358 [asan] Assign a low branch weight to ASan's slow path, patch by Jonas Wagner. This speeds up asan (at least on SPEC) by 1%-5% or more. Also fix lint in dfsan.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216972 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 21:46:51 +00:00
Matt Arsenault
2aab51a118 R600/SI: Relax some ordering in tests.
This will help with enabling misched

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216971 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 21:45:50 +00:00
Hal Finkel
bf301d5670 Add a CFL Alias Analysis implementation
This provides an implementation of CFL alias analysis (including some
supporting data structures). Currently, we don't have any extremely fancy
features, sans some interprocedural analysis (i.e. no field sensitivity, etc.),
and we do best sitting behind BasicAA + TBAA. In such a configuration, we take
~0.6-0.8% of total compile time, and give ~7-8% NoAlias responses to queries
TBAA and BasicAA couldn't answer when bootstrapping LLVM. In testing this on
other projects, we've seen up to 10.5% of queries dropped by BasicAA+TBAA
answered with NoAlias by this algorithm.

Patch by George Burgess IV (with minor modifications by me -- mostly adapting
some BasicAA tests), thanks!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216970 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 21:43:13 +00:00
Yi Jiang
cb2522448c Generate extract for in-tree uses if the use is scalar operand in vectorized instruction. radar://18144665
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216946 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 21:00:39 +00:00
Matt Arsenault
9c21df64a4 R600/SI: Fix hardcoded register numbers in test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216944 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 20:43:07 +00:00
Matt Arsenault
f471c483e6 R600/SI: Add failing testcase.
This is broken when 64-bit add is only partially
moved to the VALU.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216933 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 19:12:31 +00:00
Matt Arsenault
f7a3c7e705 Fix interference caused by fmul 2, x -> fadd x, x
If an fmul was introduced by lowering, it wouldn't be folded
into a multiply by a constant since the earlier combine would
have replaced the fmul with the fadd.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216932 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 19:02:53 +00:00
Matt Arsenault
0565503d5d Fix crash when looking up the addrspace of GEPs with vector types
Patch by Björn Steinbrink

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216930 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 18:47:54 +00:00
Reid Kleckner
f93099eb1c CodeGen: Handle va_start in the entry block
Also fix a small copy-paste bug in X86ISelLowering where Chain should
have been used in place of DAG.getEntryToken().

Fixes PR20828.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216929 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 18:42:44 +00:00
Andrea Di Biagio
ca514a41ea Revert: [APFloat] Fixed a bug in method 'fusedMultiplyAdd'.
This reverts revision 216913; the new test added at revision 216913
caused regression failures on a couple of buildbots.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216914 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 17:22:49 +00:00
Andrea Di Biagio
0b6ee9fd1c [APFloat] Fixed a bug in method 'fusedMultiplyAdd'.
When folding a fused multiply-add builtin call, make sure that we propagate the
correct result in the case where the addend is zero, and the two other operands
are finite non-zero.

Example:
  define double @test() {
    %1 = call double @llvm.fma.f64(double 7.0, double 8.0, double 0.0)
    ret double %1
  }

Before this patch, the instruction simplifier wrongly folded the builtin call
in function @test to constant 'double 7.0'.
With this patch, method 'fusedMultiplyAdd' correctly evaluates the multiply and
propagates the expected result (i.e. 56.0).

Added test fold-builtin-fma.ll with the reproducible from PR20832 plus extra
test cases to verify the behavior of method 'fusedMultiplyAdd' in the presence
of NaN/Inf operands.

This fixes PR20832.

Differential Revision: http://reviews.llvm.org/D5152


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216913 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 16:44:56 +00:00
David Majnemer
9e1f653e2f LICM: Don't crash when an instruction is used by an unreachable BB
Summary:
BBs might contain non-LCSSA'd values after the LCSSA pass is run if they
are unreachable from the entry block.

Normally, the users of the instruction would be PHIs but the unreachable
BBs have normal users; rewrite their uses to be undef values.

An alternative fix could involve fixing this at LCSSA but that would
require this invariant to hold after subsequent transforms.  If a BB
created an unreachable block, they would be in violation of this.

This fixes PR19798.

Differential Revision: http://reviews.llvm.org/D5146

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216911 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 16:22:00 +00:00
Hal Finkel
2633f795c6 Enable splitting indexing from loads with TargetConstants
When I recommitted r208640 (in r216898) I added an exclusion for TargetConstant
offsets, as there is no guarantee that a backend can handle them on generic
ADDs (even if it generates them during address-mode matching) -- and,
specifically, applying this transformation directly with TargetConstants caused
a self-hosting failure on PPC64. Ignoring all TargetConstants, however, is less
than ideal. Instead, for non-opaque constants, we can convert them into regular
constants for use with the generated ADD (or SUB).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216908 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 16:05:23 +00:00
Rafael Espindola
1e556a80ff Replace -use-init-array with -use-ctors.
We have been using .init-array for most systems for quiet some time,
but tools like llc are still defaulting to .ctors because the old
option was never changed.

This patch makes llc default to .init-array and changes the option to
be -use-ctors.

Clang is not affected by this. It has its own fancier logic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216905 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 13:54:53 +00:00
David Xu
4e2b661005 Merge Extend and Shift into a UBFX
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216899 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 09:33:56 +00:00
Hal Finkel
3da41a28a1 Revert "Revert '[DAGCombiner] Split up an indexed load if only the base pointer value is live'"
I reverted r208640 in r209747 because r208640 broke self-hosting on PPC64. The
underlying cause of the failure is that pre-inc loads with increments
represented by ISD::TargetConstants were being transformed into ISD:::ADDs with
ISD::TargetConstant operands. PPC doesn't have a pattern for those, and so they
were selected as invalid r+r adds.

This recommits r208640, rebased and with an exclusion for ISD::TargetConstant
increments. This behavior seems correct, although in the future we might want
to ask the target to split out the indexing that uses ISD::TargetConstants.

Unfortunately, I don't yet have small test case where the relevant invalid
'add' instruction is not itself dead (and thus eliminated by
DeadMachineInstructionElim -- sometimes bugpoint is too good at removing things)

Original commit message (by Adam Nemet):

Right now the load may not get DCE'd because of the side-effect of updating
the base pointer.

This can happen if we lower a read-modify-write of an illegal larger type
(e.g. i48) such that the modification only affects one of the subparts (the
lower i32 part but not the higher i16 part).  See the testcase.

In order to spot the dead load we need to revisit it when SimplifyDemandedBits
decided that the value of the load is masked off.  This is the
CommitTargetLoweringOpt piece.

I checked compile time with ARM64 by sending SPEC bitcode files through llc.
No measurable change.

Fixes <rdar://problem/16031651>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216898 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-02 06:24:04 +00:00
David Majnemer
e64e7c4634 SROA: Don't insert instructions before a PHI
SROA may decide that it needs to insert a bitcast and would set it's
insertion point before a PHI.  This will create an invalid module
right quick.

Instead, choose the first insertion point in the basic block that holds
our PHI.

This fixes PR20822.

Differential Revision: http://reviews.llvm.org/D5141

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216891 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-01 21:20:14 +00:00
David Majnemer
a0b2af46a5 Revert "Revert two GEP-related InstCombine commits"
This reverts commit r216698 which reverted r216523 and r216598.

We would attempt to perform the transformation even if the match()
failed because, as a side effect, it would set V.  This would trick us
into believing that we correctly found a place to correctly apply the
transform.

An additional test case was added to getelementptr.ll so that we might
not regress in the future.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216890 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-01 21:10:02 +00:00
Sanjay Patel
73f8eff19e Add a convenience method to copy wrapping, exact, and fast-math flags (NFC).
The loop vectorizer preserves wrapping, exact, and fast-math properties of scalar instructions.
This patch adds a convenience method to make that operation easier because we need to do this
in the loop vectorizer, SLP vectorizer, and possibly other places.

Although this is a 'no functional change' patch, I've added a testcase to verify that the exact
flag is preserved by the loop vectorizer. The wrapping and fast-math flags are already checked
in existing testcases.

Differential Revision: http://reviews.llvm.org/D5138



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216886 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-01 18:44:57 +00:00
Jingyue Wu
88350bf61d Fix a typo in comments in r216862, NFC
PR20766 -> PR20776. Thanks Roman Divacky for the catch!


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216883 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-01 14:55:04 +00:00
Tilmann Scheller
4016a9ea4a [ARM] Add Thumb-2 code size optimization regression test for EOR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216881 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-01 12:59:34 +00:00
Tilmann Scheller
9e6f09d7ce ARM] Add Thumb-2 code size optimization regression test for BIC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216880 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-01 12:53:29 +00:00
Yuri Gorshenin
861eddb266 [asan-assembly-instrumentation] Prologue and epilogue are moved out from InstrumentMemOperand().
Reviewers: eugenis

Subscribers: llvm-commits

Differential revision: http://reviews.llvm.org/D4923

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216879 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-01 12:51:00 +00:00
Renato Golin
09e28e39f0 Thumb2 M-class MSR instruction support changes
This patch implements a few changes related to the Thumb2 M-class MSR instruction:
 * better handling of unpredictable encodings,
 * recognition of the _g and _nzcvqg variants by the asm parser only if the DSP
   extension is available, preferred output of MSR APSR moves with the _<bits>
   suffix for v7-M.

Patch by Petr Pavlu.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216874 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-01 11:25:07 +00:00
Yuri Gorshenin
c642ad9546 Revert "[asan-assembly-instrumentation] Prologue and epilogue are moved out from InstrumentMemOperand()."
This reverts commit 895aa39703.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216872 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-01 10:24:04 +00:00
Chandler Carruth
1faa445476 Fix a really bad miscompile introduced in r216865 - the else-if logic
chain became completely broken here as *all* intrinsic users ended up
being skipped, and the ones that seemed to be singled out were actually
the exact wrong set.

This is a great example of why long else-if chains can be easily
confusing. Switch the entire code to use early exits and early continues
to have simpler (and more importantly, correct) logic here, as well as
fixing the reversed logic for detecting and continuing on lifetime
intrinsics.

I've also significantly cleaned up the test case and added another test
case demonstrating an example where the optimization is not (trivially)
safe to perform.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216871 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-01 10:09:18 +00:00
Renato Golin
35b114fe18 Small refactor on VectorizerHint for deduplication
Previously, the hint mechanism relied on clean up passes to remove redundant
metadata, which still showed up if running opt at low levels of optimization.
That also has shown that multiple nodes of the same type, but with different
values could still coexist, even if temporary, and cause confusion if the
next pass got the wrong value.

This patch makes sure that, if metadata already exists in a loop, the hint
mechanism will never append a new node, but always replace the existing one.
It also enhances the algorithm to cope with more metadata types in the future
by just adding a new type, not a lot of code.

Re-applying again due to MSVC 2013 being minimum requirement, and this patch
having C++11 that MSVC 2012 didn't support.

Fixes PR20655.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216870 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-01 10:00:17 +00:00
Yuri Gorshenin
895aa39703 [asan-assembly-instrumentation] Prologue and epilogue are moved out from InstrumentMemOperand().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216869 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-01 09:56:45 +00:00
Hal Finkel
32418db285 Feed AA to the inliner and use AA->getModRefBehavior in AddAliasScopeMetadata
This feeds AA through the IFI structure into the inliner so that
AddAliasScopeMetadata can use AA->getModRefBehavior to figure out which
functions only access their arguments (instead of just hard-coding some
knowledge of memory intrinsics). Most of the information is only available from
BasicAA; this is important for preserving alias scoping information for
target-specific intrinsics when doing the noalias parameter attribute to
metadata conversion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216866 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-01 09:01:39 +00:00
Nick Lewycky
0cbef49d90 Ignore lifetime intrinsics in use list for MemCpyOptimizer. Patch by Luqman Aden, review by Hal Finkel.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216865 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-01 06:03:11 +00:00
Hal Finkel
5581563457 Fix AddAliasScopeMetadata again - alias.scope must be a complete description
I thought that I had fixed this problem in r216818, but I did not do a very
good job. The underlying issue is that when we add alias.scope metadata we are
asserting that this metadata completely describes the aliasing relationships
within the current aliasing scope domain, and so in the context of translating
noalias argument attributes, the pointers must all be based on noalias
arguments (as underlying objects) and have no other kind of underlying object.
In r216818 excluding appropriate accesses from getting alias.scope metadata is
done by looking for underlying objects that are not identified function-local
objects -- but that's wrong because allocas, etc. are also function-local
objects and we need to explicitly check that all underlying objects are the
noalias arguments for which we're adding metadata aliasing scopes.

This fixes the underlying-object check for adding alias.scope metadata, and
does some refactoring of the related capture-checking eligibility logic (and
adds more comments; hopefully making everything a bit clearer).

Fixes self-hosting on x86_64 with -mllvm -enable-noalias-to-md-conversion (the
feature is still disabled by default).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216863 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-01 04:26:40 +00:00
Jingyue Wu
d43e6df10b [MachineSink] Use the real post dominator tree
Summary:
Fixes a FIXME in MachineSinking. Instead of using the simple heuristics
in isPostDominatedBy, use the real MachinePostDominatorTree. The old
heuristics caused instructions to sink unnecessarily, and might create
register pressure.

Test Plan:
Added a NVPTX codegen test to verify that our change is in effect. It also
shows the unnecessary register pressure caused by over-sinking. Updated
affected tests in AArch64 and X86.

Reviewers: eliben, meheff, Jiangning

Reviewed By: Jiangning

Subscribers: jholewinski, aemerson, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D4814



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216862 91177308-0d34-0410-b5e6-96231b3b80d8
2014-09-01 03:47:25 +00:00
David Blaikie
e2cf6dc5c9 DebugInfo: Elide lexical scopes which only contain other (inline or lexical) scopes.
DW_TAG_lexical_scopes inform debuggers about the instruction range for
which a given variable (or imported declaration/module/etc) is valid. If
the scope doesn't itself contain any such entities, it's a waste of
space and should be omitted.

We were correctly doing this for entirely empty leaves, but not for
intermediate nodes.

Reduces total (not just debug sections) .o file size for a bootstrap
-gmlt LLVM by 22% and bootstrap -gmlt clang executable by 13%. The wins
for a full -g build will be less as a % (and in absolute terms), but
should still be substantial - with some of that win being fewer
relocations, thus more substantiall reducing link times than fewer bytes
alone would have.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216861 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-31 21:26:22 +00:00
Josh Klontz
6e153ad30a [PATCH][Interpreter] Add missing FP intrinsic lowering.
Summary:
This extends the work done in [1], adding missing intrinsic lowering for floor, trunc, round and copysign.

[1] http://comments.gmane.org/gmane.comp.compilers.llvm.cvs/199372

Test Plan: Extended `test/ExecutionEngine/Interpreter/intrinsics.ll` to test the additional missing intrinsics. All tests pass.

Reviewers: dexonsmith

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D5120

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216827 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-30 18:33:35 +00:00
Jordan Rose
d421075d0c Teach llvm-bcanalyzer to use one stream's BLOCKINFO to read another stream.
This allows streams that only use BLOCKINFO for debugging purposes to omit
the block entirely. As long as another stream is available with the correct
BLOCKINFO, the first stream can still be analyzed and dumped.

As part of this commit, BitstreamReader gets a move constructor and move
assignment operator, as well as a takeBlockInfo method.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216826 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-30 17:07:55 +00:00
Hal Finkel
bd896c1b25 Fix AddAliasScopeMetadata to not add scopes when deriving from unknown pointers
The previous implementation of AddAliasScopeMetadata, which adds noalias
metadata to preserve noalias parameter attribute information when inlining had
a flaw: it would add alias.scope metadata to accesses which might have been
derived from pointers other than noalias function parameters. This was
incorrect because even some access known not to alias with all noalias function
parameters could easily alias with an access derived from some other pointer.
Instead, when deriving from some unknown pointer, we cannot add alias.scope
metadata at all. This fixes a miscompile of the test-suite's tramp3d-v4.
Furthermore, we cannot add alias.scope to functions unless we know they
access only argument-derived pointers (currently, we know this only for
memory intrinsics).

Also, we fix a theoretical problem with using the NoCapture attribute to skip
the capture check. This is incorrect (as explained in the comment added), but
would not matter in any code generated by Clang because we get only inferred
nocapture attributes in Clang-generated IR.

This functionality is not yet enabled by default.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216818 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-30 12:48:33 +00:00
David Majnemer
c6219bad2e InstCombine: Try harder to combine icmp instructions
consider: (and (icmp X, Y), (and Z, (icmp A, B)))
It may be possible to combine (icmp X, Y) with (icmp A, B).
If we successfully combine, create an 'and' instruction with Z.

This fixes PR20814.

N.B. There is room for improvement after this change but I'm not
convinced it's worth chasing yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216814 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-30 06:18:20 +00:00
Juergen Ributzka
bcbae3d680 Revert r216805 "[MachineCombiner][AArch64] Use the correct register class for MADD, SUB, and OR."
I think this broke the build bot. Reverting it for now until I have time to take a closer look.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216813 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-30 06:16:26 +00:00
Nick Kledzik
aa4d2acf37 Object/llvm-objdump: allow dumping of mach-o exports trie
MachOObjectFile in lib/Object currently has no support for parsing the rebase, 
binding, and export information from the LC_DYLD_INFO load command in final 
linked mach-o images. This patch adds support for parsing the exports trie data
structure. It also adds an option to llvm-objdump to dump that export info.

I did the exports parsing first because it is the hardest. The information is 
encoded in a trie structure, but the standard ObjectFile way to inspect content 
is through iterators. So I needed to make an iterator that would do a 
non-recursive walk through the trie and maintain the concatenation of edges 
needed for the current string prefix.

I plan to add similar support in MachOObjectFile and llvm-objdump to 
parse/display the rebasing and binding info too.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216808 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-30 00:20:14 +00:00
Juergen Ributzka
4e92383b67 [MachineCombiner][AArch64] Use the correct register class for MADD, SUB, and OR.
Select the correct register class for the various instructions that are
generated when combining instructions and constrain the registers to the
appropriate register class.

This fixes rdar://problem/18183707.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216805 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 23:48:09 +00:00
Juergen Ributzka
e7f301e079 [FastISel][AArch64] Use the correct register class for branches.
Also constrain the register class for branches.

This fixes rdar://problem/18181496.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216804 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 23:48:06 +00:00
Juergen Ributzka
bc420a0cc1 [MachineSinking] Clear kill flag of all operands at all their uses.
When sinking an instruction it might be moved past the original last use of one
of its operands. This last use has the kill flag set and the verifier will
obviously complain about this.

Before Machine Sinking (AArch64):
%vreg3<def> = ASRVXr %vreg1, %vreg2<kill>
%XZR<def> = SUBSXrs %vreg4, %vreg1<kill>, 160, %NZCV<imp-def>
...

After Machine Sinking:
%XZR<def> = SUBSXrs %vreg4, %vreg1<kill>, 160, %NZCV<imp-def>
...
%vreg3<def> = ASRVXr %vreg1, %vreg2<kill>

This fix clears all the kill flags in all instruction that use the same operands
as the instruction that is being sunk.

This fixes rdar://problem/18180996.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216803 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 23:48:03 +00:00
Adrian Prantl
4b9f3c1665 Debug info: Add a new explicit DIDescriptor flag for the "public" access
specifier and change the default behavior to only emit the
DW_AT_accessibility(public) attribute when the isPublic() is explicitly
set.

rdar://problem/18154959

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216799 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 22:44:07 +00:00
Kevin Enderby
e306159429 Next bit of support for llvm-objdump’s -private-headers for Mach-O files.
This adds the printing of the LC_SEGMENT load command and sections,
LC_SYMTAB and LC_DYSYMTAB load commands.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216795 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 22:30:52 +00:00
Robin Morisset
217b38e19a Fix typos in comments, NFC
Summary: Just fixing comments, no functional change.

Test Plan: N/A

Reviewers: jfb

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D5130

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216784 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 21:53:01 +00:00
Reid Kleckner
9436574d1b musttail: Forward regparms of variadic functions on x86_64
Summary:
If a variadic function body contains a musttail call, then we copy all
of the remaining register parameters into virtual registers in the
function prologue. We track the virtual registers through the function
body, and add them as additional registers to pass to the call. Because
this is all done in virtual registers, the register allocator usually
gives us good code. If the function does a call, however, it will have
to spill and reload all argument registers (ew).

Forwarding regparms on x86_32 is not implemented because most compilers
don't support varargs in 32-bit with regparms.

Reviewers: majnemer

Subscribers: aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D5060

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216780 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 21:42:08 +00:00
Reid Kleckner
dae28732f4 Verifier: Don't reject varargs callee cleanup functions
We've rejected these kinds of functions since r28405 in 2006 because
it's impossible to lower the return of a callee cleanup varargs
function. However there are lots of legal ways to leave such a function
without returning, such as aborting. Today we can leave a function with
a musttail call to another function with the correct prototype, and
everything works out.

I'm removing the verifier check declaring that a normal return from such
a function is UB.

Reviewed By: nlewycky

Differential Revision: http://reviews.llvm.org/D5059

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216779 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 21:25:28 +00:00
Louis Gerbarg
6393b3a677 Remove spurious mask operations from AArch64 add->compares on 16 and 8 bit values
This patch checks for DAG patterns that are an add or a sub followed by a
compare on 16 and 8 bit inputs. Since AArch64 does not support those types
natively they are legalized into 32 bit values, which means that mask operations
are inserted into the DAG to emulate overflow behaviour. In many cases those
masks do not change the result of the processing and just introduce a dependent
operation, often in the middle of a hot loop.

This patch detects the relevent DAG patterns and then tests to see if the
transforms are equivalent with and without the mask, removing the mask if
possible. The exact mechanism of this patch was discusses in
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-July/074444.html

There is a reasonably good chance there are missed oppurtunities due to similiar
(but not identical) DAG patterns that could be funneled into this test, adding
them should be simple if we see test cases.

Tests included.

rdar://13754426

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216776 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 21:00:22 +00:00
Reid Kleckner
1469e29334 X86: Fix conflict over ESI between base register and rep;movsl
The new solution is to not use this lowering if there are any dynamic
allocas in the current function. We know up front if there are dynamic
allocas, but we don't know if we'll need to create stack temporaries
with large alignment during lowering. Conservatively assume that we will
need such temporaries.

Reviewed By: hans

Differential Revision: http://reviews.llvm.org/D5128

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216775 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 20:50:31 +00:00
Robin Morisset
66b380b6b6 Relax the constraint more in MemoryDependencyAnalysis.cpp
Even loads/stores that have a stronger ordering than monotonic can be safe.
The rule is no release-acquire pair on the path from the QueryInst, assuming that
the QueryInst is not atomic itself.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216771 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 20:32:58 +00:00
Jean-Luc Duprat
1da4862939 Tablegen fixes for new syntax when initializing bits from variables.
Followup to r215086.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216757 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 19:41:04 +00:00
Juergen Ributzka
d8835d09ec [FastISel][AArch64] Fix an incorrect kill flag due to a bug in SelectTrunc.
When we select a trunc instruction we don't emit any code if the type is already
i32 or smaller. This is because the instruction that uses the truncated value
will deal with it.

This behavior can incorrectly transfer a kill flag, which was meant for the
result of the truncate, onto the source register.

%2 = trunc i32 %1 to i16
... = ... %2                -> ... = ... vreg1 <kill>
... = ... %1                   ... = ... vreg1

This commit fixes this by emitting a COPY instruction, so that the result and
source register are distinct virtual registers.

This fixes rdar://problem/18178188.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216750 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 17:58:16 +00:00
Tilmann Scheller
59758c4337 [ARM] Add Thumb-2 code size optimization test for ASR (register).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216746 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 17:19:00 +00:00
Tilmann Scheller
b1424d72ca [ARM] Add Thumb-2 code size optimization test for ASR (immediate).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216744 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 17:02:28 +00:00
Matt Arsenault
0d90962f29 Make fabs safe to speculatively execute
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216736 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 16:01:17 +00:00
Matt Arsenault
f4d57e7874 R600/SI: Use mad for fsub + fmul
We can use a negate source modifier to match
this for fsub.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216735 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 16:01:14 +00:00
Tim Northover
1e77dc84c4 AArch64: only try to get operand of a known node.
A bug in r216725 meant we tried to discover the type of a SETCC before
confirming the node actually was a SETCC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216734 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 15:34:58 +00:00
Frederic Riss
d1c544306e Remove unnecessary regex in test pattern per dblaikie suggestion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216733 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 15:32:15 +00:00
Jingyue Wu
87a2b36cf6 [NVPTX] Make the alignment an explicit argument to ldu/ldg
Summary:
Instead of specifying the alignment as metadata which may be destroyed by
transformation passes, make the alignment the second argument to ldu/ldg
intrinsic calls.

Test Plan:
ldu-ldg.ll
ldu-i8.ll
ldu-reg-plus-offset.ll

Reviewers: eliben, meheff, jholewinski

Reviewed By: meheff, jholewinski

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D5093

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216731 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 15:30:20 +00:00
Tilmann Scheller
c5484a2704 [ARM] Make Thumb-2 code size optimization test more strict.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216729 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 15:13:35 +00:00
Tilmann Scheller
f238c1844e [ARM] Add a first test for the Thumb-2 code size optimization pass.
While working on a Thumb-2 code size optimization I just realized that we don't have any regression tests for it.

So here's a first test case, I plan to increase the coverage over time.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216728 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 15:04:40 +00:00
Tim Northover
1f70cb9c14 AArch64: skip select/setcc combine in complex case.
In an llvm-stress generated test, we were trying to create a v0iN type and
asserting when that failed. This case could probably be handled by the
function, but not without added complexity and the situation it arises in is
sufficiently odd that there's probably no benefit anyway.

Should fix PR20775.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216725 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 13:05:18 +00:00
Frederic Riss
622de3b2eb Use DwarfDebug::attachLowHighPC for the compilation unit DIE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216719 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 09:00:26 +00:00
Robert Khasanov
37e671e894 [SKX] Enable lowering of integer CMP operations.
Added new types to Legalizer.
Fixed getSetCCResultType function
Added lowering tests.

Reviewed by Elena Demikhovsky.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216717 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 08:46:04 +00:00
Job Noorman
d2323fc295 Do not assume the value passed to memset is an i32.
The code in SelectionDAG::getMemset for some reason assumes the value passed to
memset is an i32. This breaks the generated code for targets that only have
registers smaller than 32 bits because the value might get split into multiple
registers by the calling convention. See the test for the MSP430 target included
in the patch for an example.

This patch ensures that nothing is assumed about the type of the value. Instead,
the type is taken from the selected overload of the llvm.memset intrinsic.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216716 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 08:23:53 +00:00
Jiangning Liu
3cd73a5ded [AArch64] Fix some failures exposed by value type v4f16 and v8f16.
1) Add some missing bitcast patterns for v8f16.
2) Add type promotion for operand of ld/st operations.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216706 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 01:31:42 +00:00
Alexey Samsonov
79e9b30b11 Introduce -DLLVM_USE_SANITIZER=Undefined CMake option to build UBSan-ified version of LLVM/Clang.
I've fixed most of the simple bugs and currently "check-llvm" test suite
has 26 failures, and "check-clang" suite has 5 failures.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216701 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 00:50:36 +00:00
Juergen Ributzka
cf45151b2c [FastISel][AArch64] Don't fold instructions that are not in the same basic block.
This fix checks first if the instruction to be folded (e.g. sign-/zero-extend,
or shift) is in the same machine basic block as the instruction we are folding
into.

Not doing so can result in incorrect code, because the value might not be
live-out of the basic block, where the value is defined.

This fixes rdar://problem/18169495.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216700 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 00:19:21 +00:00
David Majnemer
6acfc54706 Revert two GEP-related InstCombine commits
This reverts commit r216523 and r216598; people have reported
regressions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216698 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 00:06:43 +00:00
Reid Kleckner
8f58e02646 Don't promote byval pointer arguments when padding matters
Don't promote byval pointer arguments when when their size in bits is
not equal to their alloc size in bits. This can happen for x86_fp80,
where the size in bits is 80 but the alloca size in bits in 128.
Promoting these types can break passing unions of x86_fp80s and other
types.

Patch by Thomas Jablin!

Reviewed By: rnk

Differential Revision: http://reviews.llvm.org/D5057

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216693 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-28 22:42:00 +00:00
Jim Grosbach
0d34b1ed26 AArch64: More correctly constrain target vector extend lowering.
The AArch64 target lowering for [zs]ext of vectors is set up to handle
input simple types and expects the generic SDag path to do something reasonable
with anything that's not a simple type. The code, however, was only
checking that the result type was a simple type and assuming that
implied that the source type would also be a simple type. That's not a
valid assumption, as operations like "zext <1 x i1> %0 to <1 x i32>"
demonstrate. The fix is to simply explicitly validate the source type
as well as the result type.

PR20791

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216689 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-28 22:08:28 +00:00
Rafael Espindola
4bb535027e On MachO, don't put non-private constants in mergeable sections.
On MachO, putting a symbol that doesn't start with a 'L' or 'l' in one of the
__TEXT,__literal* sections prevents the linker from merging the context of the
section.

Since private GVs are the ones the get mangled to start with 'L' or 'l', we now
only put those on the __TEXT,__literal* sections.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216682 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-28 20:13:31 +00:00
Sanjay Patel
cf9661c6f9 Fix a logic bug in x86 vector codegen: sext (zext (x) ) != sext (x) (PR20472).
Remove a block of code from LowerSIGN_EXTEND_INREG() that was added with:
http://llvm.org/viewvc/llvm-project?view=revision&revision=177421

And caused:
http://llvm.org/bugs/show_bug.cgi?id=20472 (more analysis here)
http://llvm.org/bugs/show_bug.cgi?id=18054

The testcases confirm that we (1) don't remove a zext op that is necessary and (2) generate
a pmovz instead of punpck if SSE4.1 is available. Although pmovz is 1 byte longer, it allows 
folding of the load, and so saves 3 bytes overall.

Differential Revision: http://reviews.llvm.org/D4909



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216679 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-28 18:59:22 +00:00
Erik Eckstein
c84ba857ea Fix: SLPVectorizer tried to move an instruction which was replaced by a vector instruction.
For a detailed description of the problem see the comment in the test file.
The problematic moveBefore() calls are not required anymore because the new
scheduling algorithm ensures a correct ordering anyway.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216656 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-28 07:04:02 +00:00
David Xu
5ca793561e Generate CMN when comparing a short int with minus
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216651 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-28 04:59:53 +00:00
Chandler Carruth
da3a293313 [x86] Clean up some tests to use FileCheck and combine two into a single
file.

Changing code that is covered by these tests is just too hard to debug
currently, and now it will be clear the nature of the changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216643 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-28 03:41:28 +00:00
David Majnemer
b11fff1d8a InstSimplify: Move a transform from InstCombine to InstSimplify
Several combines involving icmp (shl C2, %X) C1 can be simplified
without introducing any new instructions.  Move them to InstSimplify;
while we are at it, make them more powerful.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216642 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-28 03:34:28 +00:00
Juergen Ributzka
4a76317ebb [FastISel] Undo phi node updates when falling-back to SelectionDAG.
The included test case would fail, because the MI PHI node would have two
operands from the same predecessor.

This problem occurs when a switch instruction couldn't be selected. This happens
always, because there is no default switch support for FastISel to begin with.

The problem was that FastISel would first add the operand to the PHI nodes and
then fall-back to SelectionDAG, which would then in turn add the same operands
to the PHI nodes again.

This fix removes these duplicate PHI node operands by reseting the
PHINodesToUpdate to its original state before FastISel tried to select the
instruction.

This fixes <rdar://problem/18155224>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216640 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-28 02:06:55 +00:00
Juergen Ributzka
d24494d672 [FastISel]
Currently instructions are folded very aggressively for AArch64 into the memory
operation, which can lead to the use of killed operands:
  %vreg1<def> = ADDXri %vreg0<kill>, 2
  %vreg2<def> = LDRBBui %vreg0, 2
  ... = ... %vreg1 ...

This usually happens when the result is also used by another non-memory
instruction in the same basic block, or any instruction in another basic block.

This fix teaches hasTrivialKill to not only check the LLVM IR that the value has
a single use, but also to check if the register that represents that value has
already been used. This can happen when the instruction with the use was folded
into another instruction (in this particular case a load instruction).

This fixes rdar://problem/18142857.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216634 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-28 00:09:46 +00:00
Juergen Ributzka
a26b1bdcc8 Revert "[FastISel][AArch64] Don't fold instructions too aggressively into the memory operation."
Quentin pointed out that this is not the correct approach and there is a better and easier solution.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216632 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 23:09:40 +00:00
Juergen Ributzka
1f5263e43f [FastISel][AArch64] Don't fold instructions too aggressively into the memory operation.
Currently instructions are folded very aggressively into the memory operation,
which can lead to the use of killed operands:
  %vreg1<def> = ADDXri %vreg0<kill>, 2
  %vreg2<def> = LDRBBui %vreg0, 2
  ... = ... %vreg1 ...

This usually happens when the result is also used by another non-memory
instruction in the same basic block, or any instruction in another basic block.

If the computed address is used by only memory operations in the same basic
block, then it is safe to fold them. This is because all memory operations will
fold the address computation and the original computation will never be emitted.

This fixes rdar://problem/18142857.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216629 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 22:52:33 +00:00
Juergen Ributzka
ccf53013cd [FastISel][AArch64] Fix simplify address when the address comes from a shift.
When the address comes directly from a shift instruction then the address
computation cannot be folded into the memory instruction, because the zero
register is not available as a base register. Simplify addess needs to emit the
shift instruction and use the result as base register.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216621 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 21:38:33 +00:00
Juergen Ributzka
d445e4acdb [FastISel][AArch64] Use the zero register for stores.
Use the zero register directly when possible to avoid an unnecessary register
copy and a wasted register at -O0. This also uses integer stores to store a
positive floating-point zero. This saves us from materializing the positive zero
in a register and then storing it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216617 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 21:04:52 +00:00
Reid Kleckner
2ab3b563da X86 MC: Handle instructions like fxsave that match multiple operand sizes
Instructions like 'fxsave' and control flow instructions like 'jne'
match any operand size. The loop I added to the Intel syntax matcher
assumed that using a different size would give a different instruction.
Now it handles the case where we get the same instruction for different
memory operand sizes.

This also allows us to remove the hack we had for unsized absolute
memory operands, because we can successfully match things like 'jnz'
without reporting ambiguity.  Removing this hack uncovered test case
involving 'fadd' that was ambiguous. The memory operand could have been
single or double precision.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216604 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 20:10:38 +00:00
David Majnemer
8ee308f499 InstCombine: Combine gep X, (Y-X) to Y
We try to perform this transform in InstSimplify but we aren't always
able to.  Sometimes, we need to insert a bitcast if X and Y don't have
the same time.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216598 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 20:08:37 +00:00
David Majnemer
48164ed24d InstSimplify: Don't simplify gep X, (Y-X) to Y if types differ
It's incorrect to perform this simplification if the types differ.
A bitcast would need to be inserted for this to work.

This fixes PR20771.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216597 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 20:08:34 +00:00
Nico Weber
1c768816d7 Reland r216439 215441, majnemer has a real fix for PR20771.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216586 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 20:06:19 +00:00
Nico Weber
b2f71836eb Revert r216439 (and r216441, else the former doesn't revert cleanly).
It caused PR 20771. I'll land a test on the clang side.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216582 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 20:00:13 +00:00
David Majnemer
d8e448bd27 InstSimplify: Compute comparison ranges for left shift instructions
'shl nuw CI, x' produces [CI, CI << CLZ(CI)]
'shl nsw CI, x' produces [CI << CLO(CI)-1, CI] if CI is negative
'shl nsw CI, x' produces [CI, CI << CLZ(CI)-1] if CI is non-negative

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216570 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 18:03:46 +00:00
Oliver Stannard
5e487f8dc7 Teach the AArch64 backend about v4f16 and v8f16
This teaches the AArch64 backend to deal with the operations required
to deal with the operations on v4f16 and v8f16 which are exposed by
NEON intrinsics, plus the add, sub, mul and div operations.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216555 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 16:16:04 +00:00
Michael Zolotukhin
b8c95a89e6 [SLP] Re-enable vectorization of GEP expressions (re-apply r210342 with a fix).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216549 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 15:01:18 +00:00
Chandler Carruth
7e3dc40fab [x86] Fix a regression introduced with r213897 for 32-bit targets where
we stopped efficiently lowering sextload using the SSE41 instructions
for that operation.

This is a consequence of a bad predicate I used thinking of the memory
access needs. The code actually handles the cases where the predicate
doesn't apply, and handles them much better. =] Simple fix and a test
case added. Fixes PR20767.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216538 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 11:39:47 +00:00
Chandler Carruth
963a5e6c61 [SDAG] Re-instate r215611 with a fix to a pesky X86 DAG combine.
This combine is essentially combining target-specific nodes back into target
independent nodes that it "knows" will be combined yet again by a target
independent DAG combine into a different set of target-independent nodes that
are legal (not custom though!) and thus "ok". This seems... deeply flawed. The
crux of the problem is that we don't combine un-legalized shuffles that are
introduced by legalizing other operations, and thus we don't see a very
profitable combine opportunity. So the backend just forces the input to that
combine to re-appear.

However, for this to work, the conditions detected to re-form the unlegalized
nodes must be *exactly* right. Previously, failing this would have caused poor
code (if you're lucky) or a crasher when we failed to select instructions.
After r215611 we would fall back into the legalizer. In some cases, this just
"fixed" the crasher by produces bad code. But in the test case added it caused
the legalizer and the dag combiner to iterate forever.

The fix is to make the alignment checking in the x86 side of things match the
alignment checking in the generic DAG combine exactly. This isn't really a
satisfying or principled fix, but it at least make the code work as intended.
It also highlights that it would be nice to detect the availability of under
aligned loads for a given type rather than bailing on this optimization. I've
left a FIXME to document this.

Original commit message for r215611 which covers the rest of the chang:
  [SDAG] Fix a case where we would iteratively legalize a node during
  combining by replacing it with something else but not re-process the
  node afterward to remove it.

  In a truly remarkable stroke of bad luck, this would (in the test case
  attached) end up getting some other node combined into it without ever
  getting re-processed. By adding it back on to the worklist, in addition
  to deleting the dead nodes more quickly we also ensure that if it
  *stops* being dead for any reason it makes it back through the
  legalizer. Without this, the test case will end up failing during
  instruction selection due to an and node with a type we don't have an
  instruction pattern for.

It took many million runs of the shuffle fuzz tester to find this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216537 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 11:22:16 +00:00
Robert Khasanov
e79a94a839 [SKX] Added new versions of cmp instructions in avx512_icmp_cc multiclass, added VL multiclass.
Added encoding tests


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216532 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 09:34:37 +00:00
Elena Demikhovsky
fe0c6ead85 AVX-512: Added intrinsic for VMOVSS store form with mask.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216530 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 07:38:43 +00:00
David Majnemer
dd5456bd01 InstCombine: Optimize GEP's involving ptrtoint better
We supported transforming:
(gep i8* X, -(ptrtoint Y))

to:
(inttoptr (sub (ptrtoint X), (ptrtoint Y)))

However, this only fired if 'X' had type i8*.  Generalize this to
support various types of different sizes.  This results in much better
CodeGen, especially for pointers to packed structs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216523 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 05:16:04 +00:00
David Blaikie
ad2a271781 Remove type unit skeletons. GDB no longer needs them & this saves a heap of space.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216521 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 05:04:14 +00:00
Juergen Ributzka
fc03e72b4f [FastISel][AArch64] Fix address simplification.
When a shift with extension or an add with shift and extension cannot be folded
into the memory operation, then the address calculation has to be materialized
separately. While doing so the code forgot to consider a possible sign-/zero-
extension. This fix folds now also the sign-/zero-extension into the add or
shift instruction which is used to materialize the address.

This fixes rdar://problem/18141718.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216511 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 00:58:30 +00:00
Juergen Ributzka
836f4bd090 [FastISel][AArch64] Fold Sign-/Zero-Extend into the shift immediate instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216510 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 00:58:26 +00:00
David Blaikie
c04b78627d Fix a couple of debug info test cases to match the metadata schema change in r216239
Found these while testing something else.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216505 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 00:04:16 +00:00
Reid Kleckner
3c92309f0d MC: Split the x86 asm matcher implementations by dialect
The existing matcher has lots of AT&T assembly dialect assumptions baked
into it.  In particular, the hack for resolving the size of a memory
operand by appending the four most common suffixes doesn't work at all.
The Intel assembly dialect mnemonic table has ambiguous entries, so we
need to try matching multiple times with different operand sizes, since
that's the only way to choose different instruction variants.

This makes us more compatible with gas's implementation of Intel
assembly syntax.  MSVC assumes you want byte-sized operations for the
instructions that we reject as ambiguous.

Reviewed By: grosbach

Differential Revision: http://reviews.llvm.org/D4747

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216481 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-26 20:32:34 +00:00
Joerg Sonnenberger
1d5cdfd751 Revert r210342 and r210343, add test case for the crasher.
PR 20642.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216475 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-26 19:06:41 +00:00
Yi Kong
2282afa6cc ARM: Add patterns for dbg
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216451 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-26 12:47:26 +00:00
David Majnemer
8058ffeb18 InstSimplify: Fold gep X, (sub 0, ptrtoint(X)) to null
Save InstCombine some work if we can perform this fold during
InstSimplify.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216441 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-26 07:08:03 +00:00
David Majnemer
594e4a1dd3 InstSimplify: Simplify trivial pointer expressions like b + (e - b)
consider:
long long *f(long long *b, long long *e) {
  return b + (e - b);
}

we would lower this to something like:
define i64* @f(i64* %b, i64* %e) {
  %1 = ptrtoint i64* %e to i64
  %2 = ptrtoint i64* %b to i64
  %3 = sub i64 %1, %2
  %4 = ashr exact i64 %3, 3
  %5 = getelementptr inbounds i64* %b, i64 %4
  ret i64* %5
}

This should fold away to just 'e'.

N.B.  This adds m_SpecificInt as a convenient way to match against a
particular 64-bit integer when using LLVM's match interface.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216439 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-26 05:55:16 +00:00
Reid Kleckner
fad8d818db musttail: Don't eliminate varargs packs if there is a forwarding call
Also clean up and beef up this grep test for the feature.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216425 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-26 00:59:51 +00:00
Reid Kleckner
44b3a0b411 Declare that musttail calls in variadic functions forward the ellipsis
Summary:
There is no functionality change here except in the way we assemble and
dump musttail calls in variadic functions. There's really no need to
separate out the bits for musttail and "is forwarding varargs" on call
instructions. A musttail call by definition has to forward the ellipsis
or it would fail verification.

Reviewers: chandlerc, nlewycky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D4892

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216423 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-26 00:33:28 +00:00
Reid Kleckner
9d1f8b1b21 ArgPromotion: Don't touch variadic functions
Adding, removing, or changing non-pack parameters can change the ABI
classification of pack parameters. Clang and other frontends encode the
classification in the IR of the call site, but the callee side
determines it dynamically based on the number of registers consumed so
far. Changing the prototype affects the number of registers consumed
would break such code.

Dead argument elimination performs a similar task and already has a
similar check to avoid this problem.

Patch by Thomas Jablin!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216421 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-25 23:58:48 +00:00
Bruno Cardoso Lopes
ff69509f94 Remove dangling initializers in GlobalDCE
GlobalDCE deletes global vars and updates their initializers to nullptr
while leaving underlying constants to be cleaned up later by its uses.
The clean up may never happen, fix this by forcing it every time it's
safe to destroy constants.

Final patch by Rafael Espindola
http://reviews.llvm.org/D4931

<rdar://problem/17523868>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216390 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-25 17:51:14 +00:00
Chad Rosier
373fc00835 [AArch32] Add patterns for VCVT{A,N,P,M}.
Patterns for lowering libm calls to VCVT{A,N,P,M} are also included.
Phabricator Revision: http://reviews.llvm.org/D5033

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216388 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-25 16:56:33 +00:00
Robert Khasanov
cc4b123a47 [SKX] avx512_icmp_packed multiclass extension
Extended avx512_icmp_packed multiclass by masking versions.
Added avx512_icmp_packed_rmb multiclass for embedded broadcast versions.
Added corresponding _vl multiclasses.
Added encoding tests for CPCMP{EQ|GT}* instructions.
Add more fields for X86VectorVTInfo.
Added AVX512VLVectorVTInfo that include X86VectorVTInfo for 512/256/128-bit versions

Differential Revision: http://reviews.llvm.org/D5024


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216383 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-25 14:49:34 +00:00
Karthik Bhat
e637d65af3 Allow vectorization of division by uniform power of 2.
This patch adds support to recognize division by uniform power of 2 and modifies the cost table to vectorize division by uniform power of 2 whenever possible.
Updates Cost model for Loop and SLP Vectorizer.The cost table is currently only updated for X86 backend.
Thanks to Hal, Andrea, Sanjay for the review. (http://reviews.llvm.org/D4971)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216371 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-25 04:56:54 +00:00
David Majnemer
5cbd5a13a4 InstCombine: Properly optimize or'ing bittests together
CFE, with -03, would turn:
bool f(unsigned x) {
  bool a = x & 1;
  bool b = x & 2;
  return a | b;
}

into:
  %1 = lshr i32 %x, 1
  %2 = or i32 %1, %x
  %3 = and i32 %2, 1
  %4 = icmp ne i32 %3, 0

This sort of thing exposes a nasty pathology in GCC, ICC and LLVM.

Instead, we would rather want:
  %1 = and i32 %x, 3
  %2 = icmp ne i32 %1, 0

Things get a bit more interesting in the following case:
  %1 = lshr i32 %x, %y
  %2 = or i32 %1, %x
  %3 = and i32 %2, 1
  %4 = icmp ne i32 %3, 0

Replacing it with the following sequence is better:
  %1 = shl nuw i32 1, %y
  %2 = or i32 %1, 1
  %3 = and i32 %2, %x
  %4 = icmp ne i32 %3, 0

This sequence is preferable because %1 doesn't involve %x and could
potentially be hoisted out of loops if it is invariant; only perform
this transform in the non-constant case if we know we won't increase
register pressure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216343 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-24 09:10:57 +00:00
Hal Finkel
7ca2a7d742 [PowerPC] Add support for dcbtst and icbt (prefetch)
Adds code generation support for dcbtst (data cache prefetch for write) and
icbt (instruction cache prefetch for read - Book E cores only).

We still end up with a 'cannot select' error for the non-supported prefetch
intrinsic forms. This will be fixed in a later commit.

Fixes PR20692.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216339 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-23 23:21:04 +00:00
Chad Rosier
8eb867e97d Revert "ARM: improve RTABI 4.2 conformance on Linux"
This reverts commit r215862 due to nightly failures.  Will work on getting a
reduced test case, but I wanted to get our bots green in the meantime.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216325 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-23 18:29:43 +00:00
Chandler Carruth
bfed08e41f [x86] Start fixing a really subtle and terrible form of miscompile in
these DAG combines.

The DAG auto-CSE thing is truly terrible. Due to it, when RAUW-ing
a node with its operand, you can cause its uses to CSE to itself, which
then causes their uses to become your uses which causes them to be
picked up by the RAUW. For nodes that are determined to be "no-ops",
this is "fine". But if the RAUW is one of several steps to enact
a transformation, this causes the DAG to really silently eat an discard
nodes that you would never expect. It took days for me to actually
pinpoint a test case triggering this and a really frustrating amount of
time to even comprehend the bug because I never even thought about the
ability of RAUW to iteratively consume nodes due to CSE-ing them into
itself.

To fix this, we have to build up a brand-new chain of operations any
time we are combining across (potentially) intervening nodes. But once
the logic is added to do this, another issue surfaces: CombineTo eagerly
deletes the one node combined, *but no others*. This is... really
frustrating. If deleting it makes its operands become dead, those
operand nodes often won't go onto the worklist in the
order you would want -- they're already on it and not near the top. That
means things higher on the worklist will get combined prior to these
dead nodes being GCed out of the worklist, and if the chain is long, the
immediate users won't be enough to re-detect where the root of the chain
is that became single-use again after deleting the dead nodes. The
better way to do this is to never immediately delete nodes, and instead
to just enqueue them so we can recursively delete them. The
combined-from node is typically not on the worklist anyways by virtue of
having been popped off.... But that in turn breaks other tests that
*require* CombineTo to delete unused nodes. :: sigh ::

Fortunately, there is a better way. This whole routine should have been
returning the replacement rather than using CombineTo which is quite
hacky. Switch to that, and all the pieces fall together.

I suspect the same kind of miscompile is possible in the half-shuffle
folding code, and potentially the recursive folding code. I'll be
switching those over to a pattern more like this one for safety's sake
even though I don't immediately have any test cases for them. Note that
the only way I got a test case for this instance was with *heavily* DAG
combined 256-bit shuffle sequences generated by my fuzzer. ;]

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216319 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-23 10:25:15 +00:00
Alex Lorenz
e62541e6d6 llvm-cov: test: add xfail for the big-endian buildbots
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216310 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-23 00:47:24 +00:00
Nick Lewycky
f591b9c33e Revert r215611 because it caused the infinite loop in bug 20736. There is a reduced testcase in that bug.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216307 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-23 00:45:03 +00:00
Yunzhong Gao
9fe92725af Add a test case for SROA where the store size is bigger than slice size. The
test case was fixed in r216248.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216303 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-22 23:27:04 +00:00
Rafael Espindola
03d2823e02 Add support for comdats to the gold plugin.
There are two parts to this. First, the plugin needs to tell gold the comdat by
setting comdat_key.

What gets things a bit more complicated is that gold only seems
symbols. In particular, if A is an alias to B, it only sees the symbols
A and B. It can then ask us to keep symbol A but drop symbol B. What
we have to do instead is to create an internal version of B and make A
an alias to that.

At some point some of this logic should be moved to lib/Linker so that
we don't map a Constant to an internal version just to have lib/Linker
map that again to the destination module.

The reason for implementing this in tools/gold for now is simplicity.
With it in place it should be possible to update clang to use comdats
for constructors and destructors on ELF without breaking the LTO
bootstrap. Once that is done I intend to come back and improve the
interface lib/Linker exposes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216302 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-22 23:26:10 +00:00
Alex Lorenz
6c7a6a1ba2 llvm-cov: add code coverage tool that's based on coverage mapping format and clang's pgo.
This commit expands llvm-cov's functionality by adding support for a new code coverage
tool that uses LLVM's coverage mapping format and clang's instrumentation based profiling.
The gcov compatible tool can be invoked by supplying the 'gcov' command as the first argument,
or by modifying the tool's name to end with 'gcov'.

Differential Revision: http://reviews.llvm.org/D4445


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216300 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-22 22:56:03 +00:00
Jingyue Wu
8be5600f0a [SROA] Fold a PHI node if all its incoming values are the same
Summary:
Fixes PR20425.

During slice building, if all of the incoming values of a PHI node are the same, replace the PHI node with the common value. This simplification makes alloca's used by PHI nodes easier to promote.

Test Plan: Added three more tests in phi-and-select.ll

Reviewers: nlewycky, eliben, meheff, chandlerc

Reviewed By: chandlerc

Subscribers: zinovy.nis, hfinkel, baldrick, llvm-commits

Differential Revision: http://reviews.llvm.org/D4659

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216299 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-22 22:45:57 +00:00
Reid Kleckner
d89c0abc07 ARM / x86_64 varargs: Don't save regparms in prologue without va_start
There's no need to do this if the user doesn't call va_start. In the
future, we're going to have thunks that forward these register
parameters with musttail calls, and they won't need these spills for
handling va_start.

Most of the test suite changes are adding va_start calls to existing
tests to keep things working.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216294 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-22 21:59:26 +00:00
Kevin Enderby
d9757b42ad Add the start of the support for llvm-objdump’s -private-headers for Mach-O files.
This adds the printing of the mach header. Load command printing will be next.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216285 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-22 20:35:18 +00:00
Tom Stellard
f50f927d65 R600/SI: Use READ2/WRITE2 instructions for 64-bit mem ops with 32-bit alignment
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216279 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-22 18:49:35 +00:00
Tom Stellard
ec4cb3346d R600/SI: Use a ComplexPattern for DS loads and stores
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216278 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-22 18:49:33 +00:00
Quentin Colombet
c3f2ad0879 [ARM] Move the implementation of the target hooks related to copy-related
instruction from ARMInstrInfo to ARMBaseInstrInfo.
That way, thumb mode can also benefit from the advanced copy optimization.

<rdar://problem/12702965>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216274 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-22 18:05:22 +00:00
David Majnemer
5939f08f5e InstCombine: Don't unconditionally preserve 'nuw' when shrinking constants
Consider:
  %add = add nuw i32 %a, -16777216
  %and = and i32 %add, 255

Regardless of whether or not we demand the sign bit of %add, we cannot
replace -16777216 with 2130706432 without also removing 'nuw' from the
instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216273 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-22 17:11:04 +00:00
David Majnemer
0e4fc41b0d InstCombine: sub nsw %x, C -> add nsw %x, -C if C isn't INT_MIN
We can preserve nsw during this transform if -C won't overflow.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216269 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-22 16:41:23 +00:00
Sasa Stankovic
cc59c3f335 [mips] Don't use odd-numbered float registers for double arguments for fastcc
calling convention if FP is 64-bit and +nooddspreg is used.

Differential Revision: http://reviews.llvm.org/D4981.diff


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216262 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-22 09:23:22 +00:00
David Majnemer
c86bdc73e8 InstCombine: Don't unconditionally preserve 'nsw' when shrinking constants
Consider:
  %add = add nsw i32 %a, -16777216
  %and = and i32 %add, 255

Regardless of whether or not we demand the sign bit of %add, we cannot
replace -16777216 with 2130706432 without also removing 'nsw' from the
instruction.

This fixes PR20377.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216261 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-22 07:56:32 +00:00
Erik Eckstein
6ca2d8b7c7 fix: SLPVectorizer crashes for unreachable blocks containing not schedulable instructions.
In unreachable blocks it's legal to have instructions like "%x = op %x".
Such instuctions are not schedulable. Therefore the SLPVectorizer has to check for
unreachable blocks and ignore them.

Fixes bug 20646.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216256 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-22 01:18:39 +00:00
Peter Collingbourne
f5377021c5 [dfsan] Fix non-determinism bug in non-zero label check annotator.
We now use a std::vector instead of a DenseSet to store the list of
label checks so that we can iterate over it deterministically.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216255 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-22 01:18:18 +00:00