The costs are overfitted so that I can still use the legalization factor.
For example the following kernel has about half the throughput vectorized than
unvectorized when compiled with SSE2. Before this patch we would vectorize it.
unsigned short A[1024];
double B[1024];
void f() {
int i;
for (i = 0; i < 1024; ++i) {
B[i] = (double) A[i];
}
}
radar://13599001
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179033 91177308-0d34-0410-b5e6-96231b3b80d8
PowerPC has a conditional branch to the link register (return) instruction: BCLR.
This should be used any time when we'd otherwise have a conditional branch to a
return. This adds a small pass, PPCEarlyReturn, which runs just prior to the
branch selection pass (and, importantly, after block placement) to generate
these conditional returns when possible. It will also eliminate unconditional
branches to returns (these happen rarely; most of the time these have already
been tail duplicated by the time PPCEarlyReturn is invoked). This is a nice
optimization for small functions that do not maintain a stack frame.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179026 91177308-0d34-0410-b5e6-96231b3b80d8
I've managed to convince myself that AArch64's acquire/release
instructions are sufficient to guarantee C++11's required semantics,
even in the sequentially-consistent case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179005 91177308-0d34-0410-b5e6-96231b3b80d8
I couldn't touch this file and not clean it up some. These reformattings
brought to you by clang-format, with some minor adjustments by me. More
spring cleaning to follow here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179004 91177308-0d34-0410-b5e6-96231b3b80d8
internal linkage and so wasn't a patent bug, it doesn't make any sense
here. We can avoid even calling operator<< by just embedding the newline
in the string literals that were already being streamed out. It also
gives the impression of some line-ending agnosticisms which is not
present, and that flushing happens when it doesn't.
If we want to use std::endl, we could do that, but honestly it doesn't
seem remotely worth it. Using '\n' directly is much more clear when
working with raw_ostream.
It also happens to fix builds with old crufty GCC STL implementations
that include std::endl into the global namespace (or headers written to
be compatible with such atrocities).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179003 91177308-0d34-0410-b5e6-96231b3b80d8
First, we should not cheat: fsel-based lowering of select_cc is a
finite-math-only optimization (the ISA manual, section F.3 of v2.06, makes
this clear, as does a note in our own README).
This also adds fsel-based lowering of EQ and NE condition codes. As it turned
out, fsel generation was covered by a grand total of zero regression test
cases. I've added some test cases to cover the existing behavior (which is now
finite-math only), as well as the new EQ cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179000 91177308-0d34-0410-b5e6-96231b3b80d8
The code in getTypeConversion attempts to promote the element vector type
before it trys to split or widen the vector.
After it failed finding a legal vector type by promoting it would continue using
the promoted vector element type. Thereby missing legal splitted vector types.
For example the type v32i32 that has a legal split of 4 x v3i32 on x86/sse2
would be transformed to: v32i256 and from there on successively split to:
v16i256, v8i256, v1i256 and then finally ends up as an i64 type.
By resetting the vector element type to the original vector element type that
existed before the promotion the code will attempt to split the vector type to
smaller vector widths of the same type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178999 91177308-0d34-0410-b5e6-96231b3b80d8
These were the last missing forwarding functions. Also consistently use
the forwarding functions instead of using MachOObj directly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178992 91177308-0d34-0410-b5e6-96231b3b80d8
LoadCommandInfo was needed to keep a command and its offset in the file. Now
that we always have a pointer to the command, we don't need the offset.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178991 91177308-0d34-0410-b5e6-96231b3b80d8
The fix for PR14972 in r177055 introduced a real think-o in the *store*
side, likely because I was much more focused on the load side. While we
can arbitrarily widen (or narrow) a loaded value, we can't arbitrarily
widen a value to be stored, as that changes the width of memory access!
Lock down the code path in the store rewriting which would do this to
only handle the intended circumstance.
All of the existing tests continue to pass, and I've added a test from
the PR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178974 91177308-0d34-0410-b5e6-96231b3b80d8
a relocation across sections. Do this for DW_AT_stmt list in the
skeleton CU and check the relocations in the debug_info section.
Add a FIXME for multiple CUs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@178969 91177308-0d34-0410-b5e6-96231b3b80d8