Summary:
For the moment, TargetMachine::getTargetTriple() still returns a StringRef.
This continues the patch series to eliminate StringRef forms of GNU triples
from the internals of LLVM that began in r239036.
Reviewers: rengolin
Reviewed By: rengolin
Subscribers: ted, llvm-commits, rengolin, jholewinski
Differential Revision: http://reviews.llvm.org/D10362
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239554 91177308-0d34-0410-b5e6-96231b3b80d8
Revert "[AArch64] Match interleaved memory accesses into ldN/stN instructions."
Revert "Fixing MSVC 2013 build error."
The test/CodeGen/AArch64/aarch64-interleaved-accesses.ll test was failing on OS X.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239544 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This continues the patch series to eliminate StringRef forms of GNU triples
from the internals of LLVM that began in r239036.
Reviewers: rengolin
Reviewed By: rengolin
Subscribers: llvm-commits, jfb, rengolin
Differential Revision: http://reviews.llvm.org/D10361
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239538 91177308-0d34-0410-b5e6-96231b3b80d8
This patch ensures that SHL/SRL/SRA shifts for i8 and i16 vectors avoid scalarization. It builds on the existing i8 SHL vectorized implementation of moving the shift bits up to the sign bit position and separating the 4, 2 & 1 bit shifts with several improvements:
1 - SSE41 targets can use (v)pblendvb directly with the sign bit instead of performing a comparison to feed into a VSELECT node.
2 - pre-SSE41 targets were masking + comparing with an 0x80 constant - we avoid this by using the fact that a set sign bit means a negative integer which can be compared against zero to then feed into VSELECT, avoiding the need for a constant mask (zero generation is much cheaper).
3 - SRA i8 needs to be unpacked to the upper byte of a i16 so that the i16 psraw instruction can be correctly used for sign extension - we have to do more work than for SHL/SRL but perf tests indicate that this is still beneficial.
The i16 implementation is similar but simpler than for i8 - we have to do 8, 4, 2 & 1 bit shifts but less shift masking is involved. SSE41 use of (v)pblendvb requires that the i16 shift amount is splatted to both bytes however.
Tested on SSE2, SSE41 and AVX machines.
Differential Revision: http://reviews.llvm.org/D9474
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239509 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r239437.
This broke clang-cl self-hosts. We'd end up calling the __imp_ symbol
directly instead of using it to do an indirect function call.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239502 91177308-0d34-0410-b5e6-96231b3b80d8
It hasn't been used since r130964.
This also removes MachineModuleInfo::isUsedFunction and
MachineModuleInfo::AnalyzeModule, both of which were only
there to support UsedFunctions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239501 91177308-0d34-0410-b5e6-96231b3b80d8
This is a reimplementation of D9780 at the machine instruction level rather than the DAG.
Use the MachineCombiner pass to reassociate scalar single-precision AVX additions (just a
starting point; see the TODO comments) to increase ILP when it's safe to do so.
The code is closely based on the existing MachineCombiner optimization that is implemented
for AArch64.
This patch should not cause the kind of spilling tragedy that led to the reversion of r236031.
Differential Revision: http://reviews.llvm.org/D10321
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239486 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This continues the patch series to eliminate StringRef forms of GNU triples
from the internals of LLVM that began in r239036.
Reviewers: rafael
Reviewed By: rafael
Subscribers: rafael, ted, jfb, llvm-commits, rengolin, jholewinski
Differential Revision: http://reviews.llvm.org/D10311
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239467 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This continues the patch series to eliminate StringRef forms of GNU triples
from the internals of LLVM that began in r239036.
Reviewers: rafael
Reviewed By: rafael
Subscribers: rafael, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D10307
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239465 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This continues the patch series to eliminate StringRef forms of GNU triples
from the internals of LLVM that began in r239036.
Reviewers: echristo, rafael
Reviewed By: rafael
Subscribers: rafael, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D10243
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239464 91177308-0d34-0410-b5e6-96231b3b80d8
We have to do this manually, the runtime only sets up ebp. Fixes a crash
when returning after catching an exception.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239451 91177308-0d34-0410-b5e6-96231b3b80d8
Use a "safeseh" string attribute to do this. You would think we chould
just accumulate the set of personalities like we do on dwarf, but this
fails to account for the LSDA-loading thunks we use for
__CxxFrameHandler3. Each of those needs to make it into .sxdata as well.
The string attribute seemed like the most straightforward approach.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239448 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We used to assume V->RAUW only modifies the operand list of V's user.
However, if V and V's user are Constants, RAUW may replace and invalidate V's
user entirely.
This patch fixes the above issue by letting the caller replace the
operand instead of calling RAUW on Constants.
Test Plan: @nested_const_expr and @rauw in access-non-generic.ll
Reviewers: broune, jholewinski
Reviewed By: broune, jholewinski
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D10345
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239435 91177308-0d34-0410-b5e6-96231b3b80d8
This gets all the handler info through to the asm printer and we can
look at the .xdata tables now. I've convinced one small catch-all test
case to work, but other than that, it would be a stretch to say this is
functional.
The state numbering algorithm avoids doing any scope reconstruction as
we do for C++ to simplify the implementation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239433 91177308-0d34-0410-b5e6-96231b3b80d8
Store instructions do not modify register values and therefore it's safe
to form a store pair even if the source register has been read in between
the two store instructions.
Previously, the read of w1 (see below) prevented the formation of a stp.
str w0, [x2]
ldr w8, [x2, #8]
add w0, w8, w1
str w1, [x2, #4]
ret
We now generate the following code.
stp w0, w1, [x2]
ldr w8, [x2, #8]
add w0, w8, w1
ret
All correctness tests with -Ofast on A57 with Spec200x and EEMBC pass.
Performance results for SPEC2K were within noise.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239432 91177308-0d34-0410-b5e6-96231b3b80d8
that was resetting it.
Remove the uses of DisableTailCalls in subclasses of TargetLowering and use
the value of function attribute "disable-tail-calls" instead. Also,
unconditionally add pass TailCallElim to the pipeline and check the function
attribute at the start of runOnFunction to disable the pass on a per-function
basis.
This is part of the work to remove TargetMachine::resetTargetOptions, and since
DisableTailCalls was the last non-fast-math option that was being reset in that
function, we should be able to remove the function entirely after the work to
propagate IR-level fast-math flags to DAG nodes is completed.
Out-of-tree users should remove the uses of DisableTailCalls and make changes
to attach attribute "disable-tail-calls"="true" or "false" to the functions in
the IR.
rdar://problem/13752163
Differential Revision: http://reviews.llvm.org/D10099
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239427 91177308-0d34-0410-b5e6-96231b3b80d8
array of bytes. The generation of this byte arrays was expecting
the host to be little endian, which prevents big endian hosts to be
used in the generation of the PTX code. This patch fixes the
problem by changing the way the bytes are extracted so that it
works for either little and big endian.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239412 91177308-0d34-0410-b5e6-96231b3b80d8
Specified the llvm namespace for the 2 calls to make_unique() which caused
compilation errors in Visual Studio 2013.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239405 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
For some branches, GAS accepts an immediate instead of the 2nd register operand.
We only implement this for BNE and BEQ for now. Other branch instructions can be added later, if needed.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: seanbruno, emaste, llvm-commits
Differential Revision: http://reviews.llvm.org/D9666
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239396 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This cleans up most allocas NVPTXLowerKernelArgs emits for byval
parameters.
Test Plan: makes bug21465.ll more stronger to verify no redundant local load/store.
Reviewers: eliben, jholewinski
Reviewed By: eliben, jholewinski
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D10322
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239368 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This was a longstanding FIXME and is a necessary precursor to cases
where foldOperandImpl may have to create more than one instruction
(e.g. to constrain a register class). This is the split out NFC changes from
D6262.
Reviewers: pete, ributzka, uweigand, mcrosier
Reviewed By: mcrosier
Subscribers: mcrosier, ted, llvm-commits
Differential Revision: http://reviews.llvm.org/D10174
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239336 91177308-0d34-0410-b5e6-96231b3b80d8
on a per-function basis.
Previously some of the passes were conditionally added to ARM's pass pipeline
based on the target machine's subtarget. This patch makes changes to add those
passes unconditionally and execute them conditonally based on the predicate
functor passed to the pass constructors. This enables running different sets of
passes for different functions in the module.
rdar://problem/20542263
Differential Revision: http://reviews.llvm.org/D8717
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239325 91177308-0d34-0410-b5e6-96231b3b80d8
While we have some code to transform specification like {ax} into
{eax}/{rax} if the operand type isn't 16bit, we should reject cases
where there is no sane way to do this, like the i128 type in the
example.
Related to rdar://21042280
Differential Revision: http://reviews.llvm.org/D10260
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239309 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds support for system register MMFR4_EL1 (memory model feature register) in the assembler.
This register provides information about the implemented memory model and memory management support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239302 91177308-0d34-0410-b5e6-96231b3b80d8