Commit Graph

13708 Commits

Author SHA1 Message Date
Rafael Espindola
688e7b3049 This reverts commits r239529 and r239514.
Revert "[AArch64] Match interleaved memory accesses into ldN/stN instructions."
Revert "Fixing MSVC 2013 build error."

The test/CodeGen/AArch64/aarch64-interleaved-accesses.ll test was failing on OS X.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239544 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-11 17:30:33 +00:00
Reid Kleckner
cd354fa84d Revert "Fix merges of non-zero vector stores"
This reverts commit r239539.

It was causing SDAG assertions while building freetype.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239543 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-11 17:25:24 +00:00
Matt Arsenault
564ff6478c Fix merges of non-zero vector stores
Now actually stores the non-zero constant instead of 0.
I somehow forgot to include this part of r238108.

The test change was just an independent instruction order swap,
so just add another check line to satisfy CHECK-NEXT.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239539 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-11 16:03:52 +00:00
Tom Stellard
6fcf906bb0 R600/SI: Add -mcpu=bonaire to a test that uses flat address space
Flat instructions don't exist on SI, but there is a bug in the backend that
allows them to be selected.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239533 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-11 14:51:46 +00:00
Hao Liu
442f620296 [AArch64] Match interleaved memory accesses into ldN/stN instructions.
Add a pass, AArch64InterleavedAccess, to identify and match interleaved memory accesses. This pass transforms an interleaved load/store into ldN/stN intrinsics. As the Loop Vectorizer disables optimization of interleaved accesses by default, this optimization is also disabled by default; it can be enabled with "-aarch64-interleaved-access-opt=true".

E.g. Transform an interleaved load (Factor = 2):
       %wide.vec = load <8 x i32>, <8 x i32>* %ptr
       %v0 = shuffle %wide.vec, undef, <0, 2, 4, 6>  ; Extract even elements
       %v1 = shuffle %wide.vec, undef, <1, 3, 5, 7>  ; Extract odd elements
     Into:
       %ld2 = { <4 x i32>, <4 x i32> } call aarch64.neon.ld2(%ptr)
       %v0 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 0
       %v1 = extractelement { <4 x i32>, <4 x i32> } %ld2, i32 1

E.g. Transform an interleaved store (Factor = 2):
       %i.vec = shuffle %v0, %v1, <0, 4, 1, 5, 2, 6, 3, 7>  ; Interleaved vec
       store <8 x i32> %i.vec, <8 x i32>* %ptr
     Into:
       %v0 = shuffle %i.vec, undef, <0, 1, 2, 3>
       %v1 = shuffle %i.vec, undef, <4, 5, 6, 7>
       call void aarch64.neon.st2(%v0, %v1, %ptr)
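
For reference, a minimal C++ source loop (an illustration, not part of the commit) whose vectorized form - with interleaved-access vectorization enabled - produces the factor-2 interleaved load pattern shown above:

       // Hypothetical example: reading the even/odd lanes of an interleaved
       // buffer.  Vectorizing this loop yields one wide load plus two
       // deinterleaving shuffles, which the new pass can match into ld2.
       void split(const int *in, int *even, int *odd, int n) {
         for (int i = 0; i < n; ++i) {
           even[i] = in[2 * i];      // lane 0 of each pair
           odd[i]  = in[2 * i + 1];  // lane 1 of each pair
         }
       }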



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239514 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-11 09:05:02 +00:00
Simon Pilgrim
44226ffc19 [X86][SSE] Vectorized i8 and i16 shift operators
This patch ensures that SHL/SRL/SRA shifts for i8 and i16 vectors avoid scalarization. It builds on the existing vectorized i8 SHL implementation - which moves the shift bits up to the sign bit position and separates the 4, 2 & 1 bit shifts - with several improvements:

1 - SSE41 targets can use (v)pblendvb directly with the sign bit instead of performing a comparison to feed into a VSELECT node.
2 - Pre-SSE41 targets were masking and comparing against a 0x80 constant - we avoid this by using the fact that a set sign bit means a negative integer, which can be compared against zero and then fed into VSELECT, avoiding the need for a constant mask (zero generation is much cheaper).
3 - SRA i8 needs to be unpacked to the upper byte of an i16 so that the i16 psraw instruction can be used correctly for sign extension - we have to do more work than for SHL/SRL, but perf tests indicate that this is still beneficial.

The i16 implementation is similar to, but simpler than, the i8 one - we have to do 8, 4, 2 & 1 bit shifts, but less shift masking is involved. SSE41's use of (v)pblendvb requires that the i16 shift amount is splatted to both bytes, however.
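
As a rough scalar C++ model (a sketch of the idea, not the backend code), the per-element effect of the blend-based variable i8 shift is:

       #include <cstdint>

       // Each shift-amount bit selects between the value and the value shifted
       // by 4, 2 and then 1; in the vector code each selection is a (v)pblendvb
       // on SSE41, or a compare-against-zero feeding VSELECT before SSE41.
       uint8_t shl_i8_model(uint8_t x, uint8_t amt) {
         if (amt & 4) x <<= 4;
         if (amt & 2) x <<= 2;
         if (amt & 1) x <<= 1;
         return x;            // equals x << (amt & 7)
       }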

Tested on SSE2, SSE41 and AVX machines.

Differential Revision: http://reviews.llvm.org/D9474

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239509 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-11 07:46:37 +00:00
Nemanja Ivanovic
f7d6501d1d LLVM support for vector quad bit permute and gather instructions through builtins
This patch corresponds to review:
http://reviews.llvm.org/D10096

This is the back end portion of the patch related to D10095.
The patch adds the instructions and back end intrinsics for:
vbpermq
vgbbd


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239505 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-11 06:21:25 +00:00
Sanjay Patel
c826b54b52 [x86] Add a reassociation optimization to increase ILP via the MachineCombiner pass
This is a reimplementation of D9780 at the machine instruction level rather than the DAG.

Use the MachineCombiner pass to reassociate scalar single-precision AVX additions (just a
starting point; see the TODO comments) to increase ILP when it's safe to do so.

The code is closely based on the existing MachineCombiner optimization that is implemented
for AArch64.

This patch should not cause the kind of spilling tragedy that led to the reversion of r236031.
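
For illustration (not code from the patch), this is the kind of reassociation involved - legal only when fast-math allows it:

       // Left-leaning chain: three dependent adds, depth 3.
       float chain(float a, float b, float c, float d) {
         return ((a + b) + c) + d;
       }
       // Reassociated: a+b and c+d are independent, depth 2, more ILP.
       float reassoc(float a, float b, float c, float d) {
         return (a + b) + (c + d);
       }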

Differential Revision: http://reviews.llvm.org/D10321



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239486 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-10 20:32:21 +00:00
Reid Kleckner
3de99b70aa [WinEH] _except_handlerN uses 0 instead of 1 to indicate catch-all
Our usage of 1 was a holdover from __C_specific_handler.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239482 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-10 18:14:07 +00:00
Colin LeMahieu
c196bfecd6 [Hexagon] Adding decoders for signed operands and ensuring all signed operand types disassemble correctly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239477 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-10 16:52:32 +00:00
Igor Laevsky
544d686bc0 [StatepointLowering] Reuse stack slots across basic blocks
During statepoint lowering we can sometimes avoid spilling a value if we know that it was already spilled for a previous statepoint.
We were doing this by checking whether the incoming statepoint value was lowered into a load from a stack slot. This worked only within the boundaries of one basic block.

Instead of looking at the lowered node, we can look directly at the llvm-ir value; if it was a gc.relocate (or some simple modification of it), we can look up the stack slot for its derived pointer and reuse that stack slot. This allows us to look across basic block boundaries.

Differential Revision: http://reviews.llvm.org/D10251



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239472 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-10 12:31:53 +00:00
Elena Demikhovsky
189930760d AVX-512: Fixed a bug in comparison of i1 vectors.
cmp eq should give the kxnor instruction;
cmp neq should give kxor.

https://llvm.org/bugs/show_bug.cgi?id=23631
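
A quick sanity check (not from the patch) of why eq maps to kxnor and neq to kxor for single-bit elements:

       #include <cassert>

       int main() {
         for (int a = 0; a <= 1; ++a)
           for (int b = 0; b <= 1; ++b) {
             assert((a == b) == !(a ^ b));  // eq  is XNOR
             assert((a != b) == (a ^ b));   // neq is XOR
           }
         return 0;
       }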



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239460 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-10 06:49:28 +00:00
Reid Kleckner
839f83e1e3 [WinEH] Call llvm.stackrestore in __except blocks
We have to do this manually; the runtime only sets up ebp. This fixes a crash
when returning after catching an exception.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239451 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-10 01:34:54 +00:00
Reid Kleckner
c8e72e9126 [WinEH] Emit .safeseh directives for all 32-bit exception handlers
Use a "safeseh" string attribute to do this. You would think we chould
just accumulate the set of personalities like we do on dwarf, but this
fails to account for the LSDA-loading thunks we use for
__CxxFrameHandler3. Each of those needs to make it into .sxdata as well.
The string attribute seemed like the most straightforward approach.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239448 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-10 01:02:30 +00:00
NAKAMURA Takumi
b1337aba53 Add explicit -mtriple=arm-unknown to llvm/test/CodeGen/ARM/disable-tail-calls.ll, to satisfy *-win32.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239442 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-09 23:33:25 +00:00
Jingyue Wu
95355e6498 [NVPTX] fix a crash bug in NVPTXFavorNonGenericAddrSpaces
Summary:
We used to assume V->RAUW only modifies the operand list of V's user.
However, if V and V's user are Constants, RAUW may replace and invalidate V's
user entirely.

This patch fixes the above issue by letting the caller replace the
operand instead of calling RAUW on Constants.

Test Plan: @nested_const_expr and @rauw in access-non-generic.ll

Reviewers: broune, jholewinski

Reviewed By: broune, jholewinski

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D10345

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239435 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-09 21:50:32 +00:00
Reid Kleckner
bdcbc426af [WinEH] Add 32-bit SEH state table emission prototype
This gets all the handler info through to the asm printer and we can
look at the .xdata tables now. I've convinced one small catch-all test
case to work, but other than that, it would be a stretch to say this is
functional.

The state numbering algorithm avoids doing any scope reconstruction as
we do for C++ to simplify the implementation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239433 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-09 21:42:19 +00:00
Chad Rosier
e2e26b486d [AArch64] Remove an overly conservative check when generating store pairs.
Store instructions do not modify register values and therefore it's safe
to form a store pair even if the source register has been read in between
the two store instructions.

Previously, the read of w1 (see below) prevented the formation of a stp.

        str      w0, [x2]
        ldr     w8, [x2, #8]
        add      w0, w8, w1
        str     w1, [x2, #4]
        ret

We now generate the following code.

        stp      w0, w1, [x2]
        ldr     w8, [x2, #8]
        add      w0, w8, w1
        ret

All correctness tests with -Ofast on A57 with Spec200x and EEMBC pass.
Performance results for SPEC2K were within noise.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239432 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-09 20:59:41 +00:00
Akira Hatanaka
0e3246a86f Remove DisableTailCalls from TargetOptions and the code in resetTargetOptions
that was resetting it.

Remove the uses of DisableTailCalls in subclasses of TargetLowering and use
the value of function attribute "disable-tail-calls" instead. Also,
unconditionally add pass TailCallElim to the pipeline and check the function
attribute at the start of runOnFunction to disable the pass on a per-function
basis. 
 
This is part of the work to remove TargetMachine::resetTargetOptions, and since
DisableTailCalls was the last non-fast-math option that was being reset in that
function, we should be able to remove the function entirely after the work to
propagate IR-level fast-math flags to DAG nodes is completed.

Out-of-tree users should remove the uses of DisableTailCalls and make changes
to attach attribute "disable-tail-calls"="true" or "false" to the functions in
the IR.
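
A small C++ sketch of that migration through the IR API (an illustration, not code from the patch; it assumes the two-argument Function::addFnAttr overload):

       #include "llvm/IR/Function.h"

       // Mark a function so the backend disables (or allows) tail calls for it.
       void setDisableTailCalls(llvm::Function &F, bool Disable) {
         F.addFnAttr("disable-tail-calls", Disable ? "true" : "false");
       }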

rdar://problem/13752163

Differential Revision: http://reviews.llvm.org/D10099


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239427 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-09 19:07:19 +00:00
Samuel Antao
8e8c4b5615 The constant initialization for globals in NVPTX is generated as an
array of bytes. The generation of this byte array was expecting
the host to be little endian, which prevented big-endian hosts from being
used in the generation of the PTX code. This patch fixes the
problem by changing the way the bytes are extracted so that it
works for both little- and big-endian hosts.
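
The idea, as a small host-endianness-independent C++ sketch (not the actual NVPTX printer code):

       #include <cstdint>
       #include <vector>

       // Extract bytes by value with shifts instead of reinterpreting host
       // memory, so the emitted little-endian byte array is identical on
       // little- and big-endian hosts.
       std::vector<uint8_t> toLittleEndianBytes(uint64_t v, unsigned numBytes) {
         std::vector<uint8_t> bytes;
         for (unsigned i = 0; i < numBytes; ++i)
           bytes.push_back(uint8_t(v >> (8 * i)));
         return bytes;
       }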



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239412 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-09 16:29:34 +00:00
Matt Arsenault
d2f17c4e2b Implement computeKnownBits for min/max nodes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239378 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-09 00:52:41 +00:00
Jingyue Wu
4e04297ac3 [NVPTX] run SROA after NVPTXFavorNonGenericAddrSpaces
Summary:
This cleans up most allocas NVPTXLowerKernelArgs emits for byval
parameters.

Test Plan: strengthens bug21465.ll to verify that there are no redundant local loads/stores.

Reviewers: eliben, jholewinski

Reviewed By: eliben, jholewinski

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D10322

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239368 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-09 00:05:56 +00:00
Simon Pilgrim
8176f933d9 [X86][SSE] Added lzcnt vector tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239333 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-08 19:58:43 +00:00
Akira Hatanaka
fa6bc2e94d [ARM] Pass a callback to FunctionPass constructors to enable skipping execution
on a per-function basis.

Previously, some of the passes were conditionally added to ARM's pass pipeline
based on the target machine's subtarget. This patch adds those passes
unconditionally and executes them conditionally based on the predicate functor
passed to the pass constructors. This enables running different sets of passes
for different functions in the module.
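
A C++ sketch of the pattern (hypothetical pass name, not the actual ARM code): the pass always sits in the pipeline, and a stored predicate decides per function whether it runs.

       #include "llvm/IR/Function.h"
       #include <functional>
       #include <utility>

       class ExamplePass {
         std::function<bool(const llvm::Function &)> ShouldRun;
       public:
         explicit ExamplePass(std::function<bool(const llvm::Function &)> P)
             : ShouldRun(std::move(P)) {}
         bool runOnFunction(llvm::Function &F) {
           if (ShouldRun && !ShouldRun(F))
             return false;   // skip this function; pipeline stays unchanged
           // ... the actual transformation would go here ...
           return true;
         }
       };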

rdar://problem/20542263

Differential Revision: http://reviews.llvm.org/D8717


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239325 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-08 18:50:43 +00:00
Matthias Braun
b0d6c659b7 X86: Reject register operands with obvious type mismatches.
While we have some code to transform specifications like {ax} into
{eax}/{rax} if the operand type isn't 16-bit, we should reject cases
where there is no sane way to do this, like the i128 type in the
example.

Related to rdar://21042280

Differential Revision: http://reviews.llvm.org/D10260

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239309 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-08 16:56:23 +00:00
Colin LeMahieu
838271c858 [Hexagon] Adding functionality for searching for compound instruction pairs. Compound instructions reduce slot resource requirements, freeing those packet slots up for more instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239307 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-08 16:34:47 +00:00
Simon Pilgrim
298222a930 [DAGCombiner] Added CTLZ vector constant folding support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239305 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-08 16:19:00 +00:00
Igor Breger
17e24879cb AVX-512: Implemented 256/128bit VALIGND/Q instructions for SKX and KNL
Implemented DAG lowering for all these forms.
Added tests for DAG lowering and encoding.

Differential Revision: http://reviews.llvm.org/D10310

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239300 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-08 14:03:17 +00:00
Simon Pilgrim
d72b357107 [DAGCombiner] Added CTTZ vector constant folding support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239293 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-08 09:57:09 +00:00
Simon Pilgrim
30d36cc8df [X86] Added tzcnt vector tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239264 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-07 21:01:34 +00:00
Simon Pilgrim
4c4f0921dc [X86] Added BitScanForward/BitScanReverse memory folding + tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239257 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-07 18:34:25 +00:00
Simon Pilgrim
bd795464f4 Fixed line endings
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239253 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-07 16:09:48 +00:00
Simon Pilgrim
43421abda8 [DAGCombiner] Added CTPOP vector constant folding support.
Added tests to the existing SSE/AVX test files.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239252 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-07 15:37:14 +00:00
Peter Collingbourne
f5c04a9da7 Revert r238473, "Thumb2: Modify codegen for memcpy intrinsic to prefer LDM/STM."
as it caused miscompilations and assertion failures (PR23768,
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150601/280380.html).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239169 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-05 18:01:28 +00:00
Fiona Glaser
5a0d6b758c DAGCombiner: don't duplicate (fmul x, c) in visitFNEG if fneg is free
For targets with a free fneg, this fold is always a net loss if it
ends up duplicating the multiply, so definitely avoid it.

This might be true for some targets without a free fneg too, but
I'll leave that for future investigation.
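
A scalar illustration (not the DAG combine itself) of why the fold loses when the multiply has another use:

       // One multiply plus a negate, which is free on such targets.
       float keep(float x, float c, float *other) {
         float m = x * c;
         *other = m;
         return -m;
       }
       // After folding -(x*c) into x*(-c): the multiply is duplicated.
       float folded(float x, float c, float *other) {
         *other = x * c;
         return x * -c;
       }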


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239167 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-05 17:52:34 +00:00
Alexei Starovoitov
b6ebf678ac [bpf] rename triple names bpf_be -> bpfeb
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239162 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-05 16:11:14 +00:00
Colin LeMahieu
750b351b76 [Hexagon] Reapply r239097 with tests corrected for shuffling and duplexing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239161 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-05 16:00:11 +00:00
John Brawn
272d7fdf42 [ARM] Add support for -sp- FPUs and FPU none to TargetParser
These are added mainly for the benefit of clang, but this also means that they
are now allowed in .fpu directives and we emit the correct .fpu directive when
single-precision-only is used.

Differential Revision: http://reviews.llvm.org/D10238


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239151 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-05 13:31:19 +00:00
Simon Pilgrim
841f3dbae8 [X86][AVX2] Added tests for v32i8 vector shifts
Currently still scalarized, but D9474 should remedy that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239146 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-05 12:35:36 +00:00
Andrea Di Biagio
406e5ea598 Simplify code; NFC.
Also, moved test cases from CodeGen/X86/fold-buildvector-bug.ll into
CodeGen/X86/buildvec-insertvec.ll and regenerated CHECK lines using
update_llc_test_checks.py.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239142 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-05 10:29:55 +00:00
Simon Pilgrim
8beac08b74 [X86][SSE] Added tests for i8/i16 vector shifts
Currently still scalarized, but D9474 should remedy that.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239136 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-05 08:24:23 +00:00
Swaroop Sridhar
bb3883dfba Statepoint: Fix handling of Far Immediate calls
gc.statepoint intrinsics with a far immediate call target 
were lowered incorrectly as pc-rel32 calls.

This change fixes the problem, and generates an indirect call 
via a scratch register.

For example: 

Intrinsic:
  %safepoint_token = call i32 (i64, i32, void ()*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* inttoptr (i64 140727162896504 to void ()*), i32 0, i32 0, i32 0, i32 0)

Old Incorrect Lowering:
  callq 140727162896504

New Correct Lowering:
  movabsq $140727162896504, %rax 
  callq *%rax

In lowerCallFromStatepoint(), the callee-target was modified and 
represented as a "TargetConstant" node, rather than a "Constant" node.
Undoing this modification enabled LowerCall() to generate the 
correct CALL instruction.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239114 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-04 23:03:21 +00:00
Charles Davis
3e407efb8b [Target/X86] Don't use callee-saved registers in a Win64 tail call on non-Windows.
Summary:
A small bit that I missed when I updated the X86 backend to account for
the Win64 calling convention on non-Windows. Now we don't use dead
non-volatile registers when emitting a Win64 indirect tail call on
non-Windows.

Should fix PR23710.

Test Plan: Added test for the correct behavior based on the case I posted to PR23710.

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D10258

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239111 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-04 22:50:05 +00:00
Benjamin Kramer
c9f2b5d535 [SDAG switch lowering] Fix switch case -> or merging for 0 and INT_MIN
The big/small ordering here is based on signed values, so SmallValue will
be INT_MIN and BigValue 0. This shouldn't be a problem, but the code
assumed that BigValue always had more bits set than SmallValue.

We used to just miss the transformation, but a recent refactoring of
mine turned this into an assertion failure.
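
A switch shaped like the failing case (an illustrative sketch): the only case values are 0 and INT_MIN, so under a signed compare the "small" value is INT_MIN and the "big" value is 0, even though 0 has no bits set.

       #include <climits>

       int classify(int x) {
         switch (x) {
         case 0:        return 1;
         case INT_MIN:  return 2;
         default:       return 0;
         }
       }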

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239105 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-04 22:05:51 +00:00
Colin LeMahieu
3a665e02da Shouldn't be XFAIL'ed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239103 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-04 21:49:43 +00:00
Colin LeMahieu
4e8f68f245 Revert r239095: incorrect test tree.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239102 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-04 21:32:42 +00:00
Jingyue Wu
834d242f6a [NVPTX] roll forward r239082
NVPTXISelDAGToDAG translates "addrspacecast to param" to
NVPTX::nvvm_ptr_gen_to_param

Added an llc test in bug21465.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239100 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-04 21:28:26 +00:00
Colin LeMahieu
60b4c7fc30 [Hexagon] Adding functionality for duplexing. Duplexing is a way to compress commonly used pairs of instructions in order to reduce code size. The test case duplex.ll would normally be 8 bytes: assign a register to 0 and jump to the link register. After duplexing this is only 4 bytes. This also tests the HexagonMCShuffler code path, which is used to make sure duplexed instructions still follow slot requirements.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239095 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-04 21:16:16 +00:00
Jingyue Wu
7c65b1bfb2 Revert r239082
llc crashed for the NVPTX backend.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239094 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-04 21:07:08 +00:00
Ahmed Bougacha
0d9335eda7 [GlobalMerge] Take into account minsize on Global users' parents.
Now that we can look at users, we can trivially do this: when we would
have otherwise disabled GlobalMerge (currently -O<3), we can just run
it for minsize functions, as it's usually a codesize win.

Differential Revision: http://reviews.llvm.org/D10054


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239087 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-04 20:39:23 +00:00