Summary:
This cleans up most allocas NVPTXLowerKernelArgs emits for byval
parameters.
Test Plan: makes bug21465.ll more stronger to verify no redundant local load/store.
Reviewers: eliben, jholewinski
Reviewed By: eliben, jholewinski
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D10322
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239368 91177308-0d34-0410-b5e6-96231b3b80d8
We don't want to replace function A by Function B in one module and Function B
by Function A in another module.
If these functions are marked with linkonce_odr we would end up with a function
stub calling B in one module and a function stub calling A in another module. If
the linker decides to pick these two we will have two stubs calling each other.
rdar://21265586
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239367 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes most of the test suite on Windows with clang-cl.
I'm not sure why the test suite was passing with MSVC 2013. Maybe they
changed their behavior and we are emulating their old sign extension
behavior. I think this deserves more investigation, but I want to green
the bot first.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239357 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This was a longstanding FIXME and is a necessary precursor to cases
where foldOperandImpl may have to create more than one instruction
(e.g. to constrain a register class). This is the split out NFC changes from
D6262.
Reviewers: pete, ributzka, uweigand, mcrosier
Reviewed By: mcrosier
Subscribers: mcrosier, ted, llvm-commits
Differential Revision: http://reviews.llvm.org/D10174
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239336 91177308-0d34-0410-b5e6-96231b3b80d8
on a per-function basis.
Previously some of the passes were conditionally added to ARM's pass pipeline
based on the target machine's subtarget. This patch makes changes to add those
passes unconditionally and execute them conditonally based on the predicate
functor passed to the pass constructors. This enables running different sets of
passes for different functions in the module.
rdar://problem/20542263
Differential Revision: http://reviews.llvm.org/D8717
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239325 91177308-0d34-0410-b5e6-96231b3b80d8
The Fragment and Section, and a bool for HasFragment were all used to create
a PointerUnion. Just use a pointer union instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239324 91177308-0d34-0410-b5e6-96231b3b80d8
All of ELF, COFF and MachO now manipulate the flags in helpers so we don't need
anyone to read the flags directly, but instead via those helpers.
Reviewed by Rafael Espíndola.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239317 91177308-0d34-0410-b5e6-96231b3b80d8
Also delete the now unused MCMachOSymbolFlags.h header as the only enum in there was moved to MCSymbolMachO.
Similarly to ELF and COFF, manipulating the flags is now done via helpers instead of spread
throughout the codebase.
Reviewed by Rafael Espíndola.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239316 91177308-0d34-0410-b5e6-96231b3b80d8
All flags setting/getting is now done in the class with helper methods instead
of users having to get the bits in the correct order.
Reviewed by Rafael Espíndola.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239314 91177308-0d34-0410-b5e6-96231b3b80d8
The flags field in MCSymbol only needs to be 16-bits on ELF and MachO.
This moves the 16-bit Type out of there so that it can be reduced in size in a future commit.
Reviewed by Rafael Espíndola.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239313 91177308-0d34-0410-b5e6-96231b3b80d8
While we have some code to transform specification like {ax} into
{eax}/{rax} if the operand type isn't 16bit, we should reject cases
where there is no sane way to do this, like the i128 type in the
example.
Related to rdar://21042280
Differential Revision: http://reviews.llvm.org/D10260
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239309 91177308-0d34-0410-b5e6-96231b3b80d8
The global-merge pass was crashing because it assumes that all ConstantExprs
(reached via the global variables that they use) have at least one user.
I haven't worked out a way to test this, as an unused ConstantExpr cannot be
represented by serialised IR, and global-merge can only be run in llc, which
does not run any passes which can make a ConstantExpr dead.
This (reduced to the point of silliness) C code triggers this bug when compiled
for arm-none-eabi at -O1:
static a = 7;
static volatile b[10] = {&a};
c;
main() {
c = 0;
for (; c < 10;)
printf(b[c]);
}
Differential Revision: http://reviews.llvm.org/D10314
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239308 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds support for system register MMFR4_EL1 (memory model feature register) in the assembler.
This register provides information about the implemented memory model and memory management support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239302 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We need to add a runtime memcheck for pair of accesses (x,y) where at least one of x and y
are writes.
Assuming we have w writes and r reads, currently this number is estimated as being
w* (w+r-1). This estimation will count (write,write) pairs twice and will overestimate
the number of checks required.
This change adds a getNumberOfChecks method to RuntimePointerCheck, which
will count the number of runtime checks needed (similar in implementation to
needsAnyChecking) and uses it to produce the correct number of runtime checks.
Test Plan:
llvm test suite
spec2k
spec2k6
Performance results: no changes observed (not surprising since the formula for 1 writer is basically the same, which would covers most cases - at least with the current check limit).
Reviewers: anemet
Reviewed By: anemet
Subscribers: mzolotukhin, llvm-commits
Differential Revision: http://reviews.llvm.org/D10217
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239295 91177308-0d34-0410-b5e6-96231b3b80d8
Interleaved memory accesses are grouped and vectorized into vector load/store and shufflevector.
E.g. for (i = 0; i < N; i+=2) {
a = A[i]; // load of even element
b = A[i+1]; // load of odd element
... // operations on a, b, c, d
A[i] = c; // store of even element
A[i+1] = d; // store of odd element
}
The loads of even and odd elements are identified as an interleave load group, which will be transfered into vectorized IRs like:
%wide.vec = load <8 x i32>, <8 x i32>* %ptr
%vec.even = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
%vec.odd = shufflevector <8 x i32> %wide.vec, <8 x i32> undef, <4 x i32> <i32 1, i32 3, i32 5, i32 7>
The stores of even and odd elements are identified as an interleave store group, which will be transfered into vectorized IRs like:
%interleaved.vec = shufflevector <4 x i32> %vec.even, %vec.odd, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7>
store <8 x i32> %interleaved.vec, <8 x i32>* %ptr
This optimization is currently disabled by defaut. To try it by adding '-enable-interleaved-mem-accesses=true'.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239291 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Using some SCEV functionality helped to entirely remove SCEVCache class and FindConstantPointers SCEV visitor.
Also, this makes the code more universal - I'll take advandate of it in next patches where I start handling additional types of instructions.
Test Plan: Tests would be submitted in subsequent patches.
Reviewers: atrick, chandlerc
Reviewed By: atrick, chandlerc
Subscribers: atrick, llvm-commits
Differential Revision: http://reviews.llvm.org/D10205
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239282 91177308-0d34-0410-b5e6-96231b3b80d8