27360 Commits

Author SHA1 Message Date
Ahmed Bougacha
3b9ac8c7c3 [X86] Refactor PMOV[SZ]Xrm to add missing AVX2 patterns.
Most patterns will go away once the extload legalization changes land.

Differential Revision: http://reviews.llvm.org/D6125


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223567 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-06 01:31:07 +00:00
Hans Wennborg
a421ac689f SelectionDAG switch lowering: Replace unreachable default with most popular case.
This can significantly reduce the size of the switch, allowing for more
efficient lowering.

I also worked with the idea of exploiting unreachable defaults by
omitting the range check for jump tables, but always ended up with a
non-negligible binary size increase. It might be worth looking into some more.

SimplifyCFG currently does this transformation, but I'm working towards changing
that so we can optimize harder based on unreachable defaults.
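
The transformation described above can be illustrated with a minimal sketch. It is not the SelectionDAG implementation; the function name `replace_unreachable_default` and the dictionary representation of a switch are assumptions made for illustration only.

```python
from collections import Counter

def replace_unreachable_default(cases):
    """Given a switch whose default is unreachable, pick a new default.

    `cases` maps each case value to the name of its destination block
    (a hypothetical representation). The most popular destination becomes
    the default, and every case that targeted it is dropped.
    """
    if not cases:
        return {}, None
    # Count how many case values branch to each destination.
    popular, _ = Counter(cases.values()).most_common(1)[0]
    # Cases going to the popular destination are now covered by the default.
    remaining = {value: dest for value, dest in cases.items() if dest != popular}
    return remaining, popular

cases = {0: "bb_a", 1: "bb_b", 2: "bb_a", 3: "bb_a", 4: "bb_c"}
remaining, default = replace_unreachable_default(cases)
print(remaining, default)
```

In this example the switch shrinks from five cases to two, which is the size reduction the message refers to.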

Differential Revision: http://reviews.llvm.org/D6510

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223566 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-06 01:28:50 +00:00
Duncan P. N. Exon Smith
30596886ed IR: Disallow complicated function-local metadata
Disallow complex types of function-local metadata.  The only valid
function-local metadata is an `MDNode` whose sole argument is a
non-metadata function-local value.

Part of PR21532.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223564 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-06 01:26:49 +00:00
Peter Collingbourne
a1d9932027 Add target triples to all dfsan tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223536 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 22:32:30 +00:00
Kuba Brecka
0a12d8211e Recommit of r223513 and r223514.
Reviewed at http://reviews.llvm.org/D6488



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223532 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 22:19:18 +00:00
Colin LeMahieu
ec51bc6f3a [Hexagon] Adding sub/and/or reg, imm forms
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223522 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 21:38:29 +00:00
Sanjay Patel
ab4ad4f98e Optimize merging of scalar loads for 32-byte vectors [X86, AVX]
Fix the poor codegen seen in PR21710 ( http://llvm.org/bugs/show_bug.cgi?id=21710 ).
Before we crack 32-byte build vectors into smaller chunks (and then subsequently
glue them back together), we should look for the easy case where we can just load
all elements in a single op.

An example of the codegen change is:

From:

vmovss  16(%rdi), %xmm1
vmovups (%rdi), %xmm0
vinsertps       $16, 20(%rdi), %xmm1, %xmm1
vinsertps       $32, 24(%rdi), %xmm1, %xmm1
vinsertps       $48, 28(%rdi), %xmm1, %xmm1
vinsertf128     $1, %xmm1, %ymm0, %ymm0
retq

To:

vmovups (%rdi), %ymm0
retq

Differential Revision: http://reviews.llvm.org/D6536



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223518 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 21:28:14 +00:00
Colin LeMahieu
5db47f1376 [Hexagon] Updating mux_ir/ri/ii/rr with encoding bits
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223515 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 21:09:27 +00:00
Jan Wen Voung
a44126f432 Use 32-bit ebp for NaCl64 in a limited case: llvm.frameaddress.
Summary:
Follow up to [x32] "Use ebp/esp as frame and stack pointer":
http://reviews.llvm.org/D4617

In that earlier patch, NaCl64 was made to always use rbp.
That's needed for most cases because rbp should hold a full
64-bit address within the NaCl sandbox so that load/stores
off of rbp don't require sandbox adjustment (zeroing the top
32-bits, then filling those by adding r15).

However, llvm.frameaddress returns a pointer and pointers
are 32-bit for NaCl64. In this case, use ebp instead, which
will make the register copy type check. A similar mechanism
may be needed for llvm.eh.return, but is not added in this change.

Test Plan: test/CodeGen/X86/frameaddr.ll

Reviewers: dschuff, nadav

Subscribers: jfb, llvm-commits

Differential Revision: http://reviews.llvm.org/D6514

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223510 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 20:55:53 +00:00
Bill Seurer
8dcc5c0996 [PowerPC]Update Power VSX test cases to also test fast-isel
Update of some of the VSX test cases for Power to check fast-isel codegen as well as the regular codegen.

http://reviews.llvm.org/D6357


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223509 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 20:32:05 +00:00
Colin LeMahieu
4fda99f866 [Hexagon] Adding tfrih/l instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223506 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 20:07:19 +00:00
Andrea Di Biagio
6a9a49d7ab [X86] Improved lowering of packed vector shifts to vpsllq/vpsrlq.
SSE2/AVX non-constant packed shift instructions only use the lower 64 bits of
the shift count.

This patch teaches function 'getTargetVShiftNode' how to deal with shifts
where the shift count node is of type MVT::i64.

Before this patch, function 'getTargetVShiftNode' only knew how to deal with
shift count nodes of type MVT::i32. This forced the backend to wrongly
truncate the shift count to MVT::i32, and then zero-extend it back to MVT::i64.
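
A small model shows why truncating the count is a problem. This is a sketch under the assumption that a packed 64-bit logical left shift zeroes the lane when the 64-bit count exceeds 63; the helper names `psllq_lane` and `trunc_then_zext` are hypothetical and do not correspond to LLVM functions.

```python
MASK64 = (1 << 64) - 1

def psllq_lane(value, count):
    """Model one 64-bit lane of a packed logical left shift.

    The full 64-bit count is consulted; any count above 63 yields zero.
    """
    count &= MASK64
    if count > 63:
        return 0
    return (value << count) & MASK64

def trunc_then_zext(count):
    """Model truncating the count to 32 bits and zero-extending it back."""
    return count & 0xFFFFFFFF

count = 1 << 32                       # only bit 32 is set
correct = psllq_lane(1, count)        # count > 63, so the lane is zeroed
wrong = psllq_lane(1, trunc_then_zext(count))  # count collapses to 0
print(correct, wrong)
```

With the truncated count the lane is left unshifted instead of being cleared, so the two results differ.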


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223505 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 20:02:22 +00:00
Colin LeMahieu
189606dbfe [Hexagon] Adding add reg, imm form with encoding bits and test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223504 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 19:51:23 +00:00
Duncan P. N. Exon Smith
1a283409bc BFI: Saturate when combining edges to a successor
When a loop gets bundled up, its outgoing edges are quite large, and can
just barely overflow 64 bits.  If one successor has multiple incoming
edges -- and that successor is getting all the incoming mass --
combining just its edges can overflow.  Handle that by saturating rather
than asserting.
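
Saturating accumulation can be sketched as follows. This is an illustration only, assuming edge weights are unsigned 64-bit values; `saturating_add` and `combine_edges` are hypothetical names, not the BFI API.

```python
UINT64_MAX = (1 << 64) - 1

def saturating_add(a, b):
    """Add two unsigned 64-bit values, clamping at UINT64_MAX on overflow."""
    total = a + b
    return UINT64_MAX if total > UINT64_MAX else total

def combine_edges(weights):
    """Combine all edge weights going to one successor without overflowing."""
    total = 0
    for weight in weights:
        total = saturating_add(total, weight)
    return total

print(combine_edges([UINT64_MAX - 1, 5]) == UINT64_MAX)
```

Clamping loses precision only in the case that would otherwise have wrapped around or tripped an assertion.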

This fixes PR21622.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223500 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 19:13:42 +00:00
Colin LeMahieu
78ec9010c5 [Hexagon] Adding DoubleRegs decoder. Moving C2_mux and A2_nop. Adding combine imm-imm form.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223494 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 18:24:06 +00:00
Adrian Prantl
33ada85735 Fix a bug when pretty-printing DW_OP_deref.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223493 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 18:19:38 +00:00
Adrian Prantl
bb3e8dc693 Regenerate this stale testcase from source.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223492 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 18:19:32 +00:00
Colin LeMahieu
0785bdf107 [Hexagon] Adding combine reg-reg forms.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223485 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 17:38:36 +00:00
Colin LeMahieu
4c58675d35 [Hexagon] Marking several instructions as isCodeGenOnly=0 and adding direct disassembly tests for many instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223482 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 17:27:39 +00:00
Asiri Rathnayake
3ad762170b Improvements to ARM assembler tests
No functional changes. Got myself bitten in r223113 when adding support for
modified immediate syntax (regressions reported by joerg@britannica.bec.de,
fixes in r223366 and r223381). Our assembler tests did not cover several
different syntax variants. This patch expands the test coverage to check for
the following cases:

1. Modified immediate operands may be expressed with expressions, as in #(4 * 2)
instead of #8.

2. Modified immediate operands may be _optionally_ prefixed by a '#' symbol or a
'$' symbol.

3. Certain instructions (e.g. ADD) support single input register variants;
[ADD r0, #mod_imm] is the same as [ADD r0, r0, #mod_imm].

4. Certain instructions have aliases which convert plain immediates to modified
immediates. For example, [ADD r0, -10] is not valid because -10 (in two's
complement) cannot be encoded as a modified immediate, but ARMInstrInfo.td
defines an alias which can transform this into a [SUB r0, 10].
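
The encodability point in case 4 can be checked with a short sketch. It assumes the usual ARM-mode definition of a modified immediate, an 8-bit value rotated right by an even amount within 32 bits; the function name `is_modified_immediate` is hypothetical.

```python
def is_modified_immediate(value):
    """Return True if `value` fits the ARM modified immediate form.

    The form is an 8-bit constant rotated right by an even amount (0 to 30)
    within a 32-bit word. Rotating left by the same amount undoes it.
    """
    value &= 0xFFFFFFFF  # view negative numbers in two's complement
    for rot in range(0, 32, 2):
        rotated = ((value << rot) | (value >> (32 - rot))) & 0xFFFFFFFF
        if rotated < 256:
            return True
    return False

print(is_modified_immediate(-10), is_modified_immediate(10))
```

Since -10 is 0xFFFFFFF6 in two's complement, no rotation fits in 8 bits, while 10 encodes directly. That is why an alias rewriting the ADD as a SUB is useful.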

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223475 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 16:33:56 +00:00
Evgeniy Stepanov
c4c08aab64 [msan] Avoid extra origin address realignment.
Do not realign origin address if the corresponding application
address is at least 4-byte-aligned.

Saves 2.5% code size in track-origins mode.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223464 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 14:34:03 +00:00
Andrea Di Biagio
54529ed1c4 [X86] Avoid introducing extra shuffles when lowering packed vector shifts.
When lowering a vector shift node, the backend checks if the shift count is a
shuffle with a splat mask. If so, then it introduces an extra dag node to
extract the splat value from the shuffle. The splat value is then used
to generate a shift count of a target specific shift.

However, if we know that the shift count is a splat shuffle, we can use the
splat index 'I' to extract the I-th element from the first shuffle operand.
The advantage is that the splat shuffle may become dead since we no longer
use it.

Example:

;;
define <4 x i32> @example(<4 x i32> %a, <4 x i32> %b) {
  %c = shufflevector <4 x i32> %b, <4 x i32> undef, <4 x i32> zeroinitializer
  %shl = shl <4 x i32> %a, %c
  ret <4 x i32> %shl
}
;;

Before this patch, llc generated the following code (-mattr=+avx):
  vpshufd $0, %xmm1, %xmm1   # xmm1 = xmm1[0,0,0,0]
  vpxor  %xmm2, %xmm2
  vpblendw $3, %xmm1, %xmm2, %xmm1 # xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
  vpslld %xmm1, %xmm0, %xmm0
  retq

With this patch, the redundant splat operation is removed from the code.
  vpxor  %xmm2, %xmm2
  vpblendw $3, %xmm1, %xmm2, %xmm1 # xmm1 = xmm1[0,1],xmm2[2,3,4,5,6,7]
  vpslld %xmm1, %xmm0, %xmm0
  retq


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223461 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 12:13:30 +00:00
Charlie Turner
1610d6e878 Add missing FP build attribute tests.
The test file test/CodeGen/ARM/build-attributes.ll was missing several
floating-point build attribute tests. The intention of this commit is that for
each CPU / architecture currently tested, there are now tests that make sure
the following attributes are sufficiently checked:

  * Tag_ABI_FP_rounding
  * Tag_ABI_FP_denormal
  * Tag_ABI_FP_exceptions
  * Tag_ABI_FP_user_exceptions
  * Tag_ABI_FP_number_model

Also in this commit, the -unsafe-fp-math flag has been augmented with the full
suite of flags Clang sends to LLVM when you pass -ffast-math to Clang. That is,
`-unsafe-fp-math' has been changed to `-enable-unsafe-fp-math -disable-fp-elim
-enable-no-infs-fp-math -enable-no-nans-fp-math -fp-contract=fast'

Change-Id: I35d766076bcbbf09021021c0a534bf8bf9a32dfc

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223454 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 08:22:47 +00:00
Hal Finkel
138d5bf371 Revert "r223440 - Consider subregs when calling MI::registerDefIsDead for phys deps"
Reverting this because, while it fixes the problem in the reduced test case, it
does not fix the problem in the full test case from the bug report.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223442 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 02:07:35 +00:00
Hal Finkel
d281f1443a Consider subregs when calling MI::registerDefIsDead for phys deps
The scheduling dependency graph is built bottom-up within each scheduling
region, and ScheduleDAGInstrs::addPhysRegDeps is called to add output/anti
dependencies, based on physical registers, to the SUs for instructions
based on those that come before them.

In the test case, we start before post-RA scheduling with a block that looks
like this:

...
	INLINEASM <...
andc $0,$0,$2
stdcx. $0,0,$3
bne- 1b
> [sideeffect] [mayload] [maystore] [attdialect], $0:[regdef-ec:G8RC], %X6<earlyclobber,def,dead>, $1:[mem], %X3<kill>, $2:[reguse:G8RC], %X5<kill>, $3:[reguse:G8RC], %X3, $4:[mem], %X3, $5:[clobber], %CC<earlyclobber,imp-def,dead>, <<badref>>
	...
	%X4<def,dead> = ANDIo8 %X4<kill>, 1, %CR0<imp-def,dead>, %CR0GT<imp-def>
	...
	%R29<def> = ISEL %R3<undef>, %R4<kill>, %CR0GT<kill>

where it is relevant that %CC is an alias to %CR0, and that %CR0GT is a
subregister of %CR0. However, for post-RA scheduling, no dependency was added
to prevent the INLINEASM from being scheduled in between the ANDIo8 and the
ISEL (which communicate via the %CR0GT register).

In ScheduleDAGInstrs::addPhysRegDeps, when called for the %CC operand, we'd
iterate over all of its aliases (which include %CC itself and also %CR0), and
look for previously-encountered defs of those registers. We'd find the ANDIo8,
but decide not to add a dependency between the INLINEASM and the ANDIo8 because
both the INLINEASM's def of %CC and the ANDIo8's def of %CR0 are dead. This
ignores, however, that the ANDIo8 has a non-dead def of %CR0GT, a subregister
of %CR0, and thus a dependency still must exist.

To fix this problem, when calling registerDefIsDead on the SU with the def, we
also check all subregisters for possible non-dead defs, and add the dependency
if any are found.
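
The check described above can be modeled in a few lines. This is a sketch, not the ScheduleDAGInstrs code: instructions are represented as a dictionary from register name to a "def is dead" flag, the subregister table is illustrative, and `reg_def_is_dead` is a hypothetical helper.

```python
def reg_def_is_dead(inst_defs, reg, subregs_of):
    """Return True only if the def of `reg` is dead and no subregister
    of `reg` has a non-dead def on the same instruction.

    `inst_defs` maps register name -> True when that def is marked dead.
    `subregs_of` maps register name -> list of its subregister names.
    """
    if inst_defs.get(reg) is not True:
        return False
    # A subregister the instruction does not define cannot block the result.
    return all(inst_defs.get(sub, True) for sub in subregs_of.get(reg, ()))

# Illustrative subregister table for the condition register field.
subregs_of = {"%CR0": ["%CR0LT", "%CR0GT", "%CR0EQ", "%CR0UN"]}
# Defs of the ANDIo8 from the message: %X4 and %CR0 dead, %CR0GT live.
andio8_defs = {"%X4": True, "%CR0": True, "%CR0GT": False}

old_answer = andio8_defs["%CR0"]                             # ignores subregs
new_answer = reg_def_is_dead(andio8_defs, "%CR0", subregs_of)
print(old_answer, new_answer)
```

Looking only at %CR0 reports the def as dead, while the subregister-aware check reports it as live, so the dependency is kept.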

Fixes PR21742.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223440 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 01:57:22 +00:00
Adrian Prantl
af25a5d9b7 Add a comment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223427 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 01:02:36 +00:00
Rafael Espindola
d3c3235ab3 Add a few extra cases to the test. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223417 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-05 00:02:42 +00:00
Kevin Enderby
543abe4906 Re-add support to llvm-objdump for Mach-O universal files and archives with -macho
with fixes.  This includes moving the llvm-objdump tests for universal files to
an X86 directory, and the fix for the failure on Linux that Rafael tracked down
with asan. I had both Jim Grosbach and Adam Hemet look over the second fix since
I could not set up asan to confirm that the failure reproduces with the old
version but not with the fix.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223416 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 23:56:27 +00:00
Rafael Espindola
fe137022a5 Convert test to use an extra Input file. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223414 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 23:31:21 +00:00
Adrian Prantl
9e8083744d Simplify implementation and testcase of r223401 based on feedback from dblaikie.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223405 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 22:58:41 +00:00
Adrian Prantl
fb8dcb45c6 Debug info: If the RegisterCoalescer::reMaterializeTrivialDef() is
eliminating all uses of a vreg, update any DBG_VALUE describing that vreg
to point to the rematerialized register instead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223401 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 22:29:04 +00:00
Hans Wennborg
76b1313e75 Add some tests for SimplifyCFG's TurnSwitchRangeIntoICmp(). NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223396 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 22:19:28 +00:00
Hans Wennborg
8de6d3e510 Add some tests for SimplifyCFG's ConstantFoldTerminator(). NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223395 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 22:19:25 +00:00
Weiming Zhao
1b77bb5628 [AArch64] Combining Load and IntToFp should check for neon availability
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223382 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 20:25:50 +00:00
Asiri Rathnayake
61f3193001 Fix yet another unseen regression caused by r223113
r223113 added support for ARM modified immediate assembly syntax, which
assumes all immediate operands are prefixed with a '#'. This assumption
is wrong as per the ARMARM, which recommends that all '#' characters be
treated as optional. The current patch fixes this regression and adds a test
case. A follow-up patch will expand the test coverage to other instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223381 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 19:34:59 +00:00
Jonathan Roelofs
3d47855394 Fix thumbv4t indirect calls
So there are a couple of issues with indirect calls on thumbv4t. First, the most
'obvious' instruction, 'blx' isn't available until v5t. And secondly, the
next-most-obvious sequence: 'mov lr, pc; bx rN' doesn't DTRT in thumb code
because the saved off pc has its thumb bit cleared, so when the callee returns
we end up in ARM mode.... yuck.

The solution is to 'bl' to a nearby landing pad with a 'bx rN' in it.

We could cut down on code size by sharing the landing pads between call sites
that are close enough, but for the moment let's do correctness first and look at
performance later.


Patch by: Iain Sandoe

http://reviews.llvm.org/D6519


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223380 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 19:34:50 +00:00
Philip Reames
ef7c2ff0f3 Add a test case for argument type coercion in an invoke of a vararg function
This would have caught the bug I fixed in r223370.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223378 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 19:13:45 +00:00
Hal Finkel
efbb95a1be Revert "r223364 - Revert r223347 which has caused crashes on bootstrap bots."
Reapply r223347, with a fix to not crash on uninserted instructions (or more
precisely, instructions in uninserted blocks). bugpoint was able to reduce the
test case somewhat, but it is still somewhat large (and relies on setting
things up to be simplified during inlining), so I've not included it here.
Nevertheless, it is clear what is going on and why.

Original commit message:

Restrict somewhat the memory-allocation pointer cmp opt from r223093

Based on review comments from Richard Smith, restrict this optimization from
applying to globals that might resolve lazily to other dynamically-loaded
modules, and also from dynamic allocas (which might be transformed into malloc
calls). In short, take extra care that the compared-to pointer is really
simultaneously live with the memory allocation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223371 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 17:45:19 +00:00
Asiri Rathnayake
9571274787 Fix a minor regression introduced in r223113
r223113 added support for ARM modified immediate assembly syntax. That patch
has broken support for immediate expressions, as in:
    add r0, #(4 * 4)
It wasn't caught because we don't have any tests for this feature. This patch
fixes this regression and adds test cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223366 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 14:49:07 +00:00
Alexander Potapenko
182d9aaccb Revert r223347 which has caused crashes on bootstrap bots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223364 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 14:22:27 +00:00
Rafael Espindola
7e32ae6cf7 Revert "[Thumb/Thumb2] Added restrictions on PC, LR, SP in the register list for PUSH/POP/LDM/STM. <Differential Revision: http://reviews.llvm.org/D6090>"
This reverts commit r223356.

It was failing check-all (MC/ARM/thumb.s in particular).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223363 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 14:10:20 +00:00
Michael Kuperstein
5e343e6fd0 [X86] Improve a dag-combine that handles a vector extract -> zext sequence.
The current DAG combine turns a sequence of extracts from <4 x i32> followed by zexts into a store followed by scalar loads.
According to measurements by Martin Krastev (see PR 21269) for x86-64, a sequence of an extract, movs and shifts gives better performance. However, for 32-bit x86, the previous sequence still seems better.

Differential Revision: http://reviews.llvm.org/D6501

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223360 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 13:49:51 +00:00
Jyoti Allur
996b683a9f [Thumb/Thumb2] Added restrictions on PC, LR, SP in the register list for PUSH/POP/LDM/STM. <Differential Revision: http://reviews.llvm.org/D6090>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223356 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 11:52:49 +00:00
Patrik Hagglund
cfb121f286 Use DomTree in MachineSink to sink over diamonds.
According to a previous FIXME comment we now not only look at MBB
successors, but also handle code sinking past them:

  x = computation
  if () {} else {}
  use x

The instruction could be sunk over the whole diamond for the
if/then/else (or loop, etc), allowing it to be sunk into other blocks
after that.

Modified the test added in r204522, since one fewer spill is now present.

Minor fixes in comments.

Patch provided by Jonas Paulsson. Reviewed by Hal Finkel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223350 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 10:36:42 +00:00
Simon Pilgrim
94590ca4cf [InstCombine] Minor optimization for bswap with binary ops
Added instcombine optimizations for BSWAP with AND/OR/XOR ops:

OP( BSWAP(x), BSWAP(y) ) -> BSWAP( OP(x, y) )
OP( BSWAP(x), CONSTANT ) -> BSWAP( OP(x, BSWAP(CONSTANT) ) )

Since it's just a one-liner, I've also added BSWAP to the DAGCombiner equivalent:

fold (OP (bswap x), (bswap y)) -> (bswap (OP x, y))
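
Both folds rely on byte swapping being a pure permutation of bits, so it commutes with bitwise AND/OR/XOR. A quick check, written as a sketch with a hypothetical `bswap32` helper and arbitrary sample values:

```python
import operator

def bswap32(x):
    """Reverse the byte order of a 32-bit value."""
    return int.from_bytes((x & 0xFFFFFFFF).to_bytes(4, "little"), "big")

x, y, c = 0x12345678, 0x0F0F00FF, 0x000000FF
for op in (operator.and_, operator.or_, operator.xor):
    # OP( BSWAP(x), BSWAP(y) ) -> BSWAP( OP(x, y) )
    assert op(bswap32(x), bswap32(y)) == bswap32(op(x, y))
    # OP( BSWAP(x), CONSTANT ) -> BSWAP( OP(x, BSWAP(CONSTANT)) )
    assert op(bswap32(x), c) == bswap32(op(x, bswap32(c)))
print("identities hold for the sample values")
```

The second form works because swapping the constant ahead of time cancels against the outer swap.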

Refactored bswap-fold tests to use FileCheck instead of just checking that the bswaps had gone.

Differential Revision: http://reviews.llvm.org/D6407



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223349 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 09:44:01 +00:00
Elena Demikhovsky
73ae1df82c Masked Load / Store Intrinsics - the CodeGen part.
I'm recommitting the codegen part of the patch.
The vectorizer part will be sent for review again.

Masked Vector Load and Store Intrinsics.
Introduced new target-independent intrinsics in order to support masked vector loads and stores. The loop vectorizer optimizes loops containing conditional memory accesses by generating these intrinsics for existing targets AVX2 and AVX-512. The vectorizer asks the target about availability of masked vector loads and stores.
Added SDNodes for masked operations and lowering patterns for X86 code generator.
Examples:
<16 x i32> @llvm.masked.load.v16i32(i8* %addr, <16 x i32> %passthru, i32 4 /* align */, <16 x i1> %mask)
declare void @llvm.masked.store.v8f64(i8* %addr, <8 x double> %value, i32 4, <8 x i1> %mask)

Scalarizer for other targets (not AVX2/AVX-512) will be done in a separate patch.
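
The per-lane behavior of the two intrinsics can be modeled as below. This is a simplified sketch: memory is a flat Python list indexed by element rather than by byte, alignment is ignored, and `masked_load` and `masked_store` are illustrative helpers rather than LLVM APIs.

```python
def masked_load(memory, addr, passthru, mask):
    """Load lanes whose mask bit is set; other lanes take the passthru value."""
    return [memory[addr + i] if m else p
            for i, (p, m) in enumerate(zip(passthru, mask))]

def masked_store(memory, addr, values, mask):
    """Store only the lanes whose mask bit is set; other memory is untouched."""
    for i, (v, m) in enumerate(zip(values, mask)):
        if m:
            memory[addr + i] = v

memory = [10, 11, 12, 13, 14, 15]
loaded = masked_load(memory, 1, passthru=[0, 0, 0, 0], mask=[1, 0, 1, 0])
masked_store(memory, 2, values=[7, 8, 9], mask=[0, 1, 1])
print(loaded, memory)
```

Masked-off lanes never touch memory in this model, which is what lets a vectorizer handle loops with conditional memory accesses.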

http://reviews.llvm.org/D6191



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223348 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 09:40:44 +00:00
Hal Finkel
d70d5148a6 Restrict somewhat the memory-allocation pointer cmp opt from r223093
Based on review comments from Richard Smith, restrict this optimization from
applying to globals that might resolve lazily to other dynamically-loaded
modules, and also from dynamic allocas (which might be transformed into malloc
calls). In short, take extra care that the compared-to pointer is really
simultaneously live with the memory allocation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223347 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 09:22:28 +00:00
Jean-Daniel Dupas
9ce01153e0 Add missing test file
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223346 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 09:20:13 +00:00
Jean-Daniel Dupas
206b84f324 Add mach-o LC_RPATH support to llvm-objdump
Summary: Add rpath load command support in Mach-O object and update llvm-objdump to use it.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D6512

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223343 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 07:37:02 +00:00
Rafael Espindola
4fe0f3fe71 Revert "Add missing test dependency and use a more canonical target name."
This reverts commit r223336.

NAKAMURA Takumi did the same thing in r223332!

Sorry about the noise.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@223337 91177308-0d34-0410-b5e6-96231b3b80d8
2014-12-04 04:33:32 +00:00