No functional change because "lsl #12" is actually encoded as 12, but one less
bug if someone ever decides to change that for the giggles.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243536 91177308-0d34-0410-b5e6-96231b3b80d8
Given certain shuffle-vector masks, LLVM emits splat instructions
which splat the wrong bytes from the source register. The issue is
that the function PPC::isSplatShuffleMask() in PPCISelLowering.cpp
does not ensure that the splat pattern found is requesting bytes that
are aligned on an EltSize boundary. This patch detects this situation
as not a valid splat mask, resulting in a permute being generated
instead of a splat.
Patch and test case by Tyler Kenney, cleaned up a bit by me.
This is a simple bug fix that would be good to incorporate into 3.7.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243519 91177308-0d34-0410-b5e6-96231b3b80d8
This commit defines subtarget feature strict-align and uses it instead of
cl::opt -aarch64-strict-align to decide whether strict alignment should be
forced.
rdar://problem/21529937
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243516 91177308-0d34-0410-b5e6-96231b3b80d8
This fix was suggested as part of D11345 and is part of fixing PR24141.
With this change, we can avoid walking the uses of a divisor node if the target
doesn't want the combineRepeatedFPDivisors transform in the first place.
There is no NFC-intended other than that.
Differential Revision: http://reviews.llvm.org/D11531
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243498 91177308-0d34-0410-b5e6-96231b3b80d8
This commit defines subtarget feature strict-align and uses it instead of
cl::opt -arm-strict-align to decide whether strict alignment should be
forced. Also, remove the logic that was checking the OS and architecture
as clang is now responsible for setting strict-align based on the command
line options specified and the target architecute and OS.
rdar://problem/21529937
http://reviews.llvm.org/D11470
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243493 91177308-0d34-0410-b5e6-96231b3b80d8
Reapply 243271 with more fixes; although we are not handling multiple
sources with coalescable copies, we were not properly skipping this
case.
- Teaches the ValueTracker in the PeepholeOptimizer to look through PHI
instructions.
- Add findNextSourceAndRewritePHI method to lookup into multiple sources
returnted by the ValueTracker and rewrite PHIs with new sources.
With these changes we can find more register sources and rewrite more
copies to allow coaslescing of bitcast instructions. Hence, we eliminate
unnecessary VR64 <-> GR64 copies in x86, but it could be extended to
other archs by marking "isBitcast" on target specific instructions. The
x86 example follows:
A:
psllq %mm1, %mm0
movd %mm0, %r9
jmp C
B:
por %mm1, %mm0
movd %mm0, %r9
jmp C
C:
movd %r9, %mm0
pshufw $238, %mm0, %mm0
Becomes:
A:
psllq %mm1, %mm0
jmp C
B:
por %mm1, %mm0
jmp C
C:
pshufw $238, %mm0, %mm0
Differential Revision: http://reviews.llvm.org/D11197
rdar://problem/20404526
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243486 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Currently, we support only the MIPS O32 ABI calling convention for call
lowering. With this change we avoid using the O32 calling convetion for
lowering calls marked as using the fast calling convention.
Reviewers: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11515
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243485 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Generate correct code for the select instruction by zero-extending
it's boolean/condition operand to GPR-width. This is necessary because
the conditional-move instructions operate on the whole register.
Reviewers: dsanders
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11506
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243469 91177308-0d34-0410-b5e6-96231b3b80d8
If the pointer is the store's value operand, this would produce
a broken module. Make sure the use is actually for the pointer operand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243462 91177308-0d34-0410-b5e6-96231b3b80d8
The 'common' section TLS is not implemented.
Current C/C++ TLS variables are not placed in common section.
DWARF debug info to get the address of TLS variables is not generated yet.
clang and driver changes in http://reviews.llvm.org/D10524
Added -femulated-tls flag to select the emulated TLS model,
which will be used for old targets like Android that do not
support ELF TLS models.
Added TargetLowering::LowerToTLSEmulatedModel as a target-independent
function to convert a SDNode of TLS variable address to a function call
to __emutls_get_address.
Added into lib/Target/*/*ISelLowering.cpp to call LowerToTLSEmulatedModel
for TLSModel::Emulated. Although all targets supporting ELF TLS models are
enhanced, emulated TLS model has been tested only for Android ELF targets.
Modified AsmPrinter.cpp to print the emutls_v.* and emutls_t.* variables for
emulated TLS variables.
Modified DwarfCompileUnit.cpp to skip some DIE for emulated TLS variabls.
TODO: Add proper DIE for emulated TLS variables.
Added new unit tests with emulated TLS.
Differential Revision: http://reviews.llvm.org/D10522
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243438 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Add patterns for doing floating point round with various rounding modes
followed by conversion to int as a single FCVT* instruction.
Reviewers: t.p.northover, jmolloy
Subscribers: aemerson, rengolin, mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D11424
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243422 91177308-0d34-0410-b5e6-96231b3b80d8
This path add the aarch64 lowering of __builtin_thread_pointer. It uses
the already implemented AArch64ISD::THREAD_POINTER used in TLS generation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243412 91177308-0d34-0410-b5e6-96231b3b80d8
X86FrameLowering has both a mergeSPUpdates() that accepts a direction, and an
mergeSPUpdatesUp(), which seem to do the same thing, except for a slightly
different interface. Removed the less general function.
NFC.
Differential Revision: http://reviews.llvm.org/D11510
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243396 91177308-0d34-0410-b5e6-96231b3b80d8
VPAND is a lot faster than VPSHUFB and VPBLENDVB - this patch ensures we attempt to lower to a basic bitmask before lowering to the slower byte shuffle/blend instructions.
Split off from D11518.
Differential Revision: http://reviews.llvm.org/D11541
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243395 91177308-0d34-0410-b5e6-96231b3b80d8
This is a follow-up to the FIXME that was added with D7474 ( http://reviews.llvm.org/rL229531 ).
I thought this load folding bug had been made hard-to-hit, but it turns out to be very easy
when targeting 32-bit x86 and causes a miscompile/crash in Wine:
https://bugs.winehq.org/show_bug.cgi?id=38826https://llvm.org/bugs/show_bug.cgi?id=22371#c25
The quick fix is to simply remove the scalar FP logical instructions from the load folding table
in X86InstrInfo, but that causes us to miss load folds that should be possible when lowering fabs,
fneg, fcopysign. So the majority of this patch is altering those lowerings to use *vector* FP
logical instructions (because that's all x86 gives us anyway). That lets us do the load folding
legally.
Differential Revision: http://reviews.llvm.org/D11477
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243361 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: WebAssemblySubtarget.cpp expects a default 'generic' CPU to exist, and this seems to be prevalent with other targets. It makes sense to have something between MVP and bleeding-edge, even though for now it's the same as MVP. This removes a warning that's currently generated.
Subscribers: jfb, llvm-commits, sunfish
Differential Revision: http://reviews.llvm.org/D11546
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243345 91177308-0d34-0410-b5e6-96231b3b80d8
be reserved.
The decision to reserve x18 is going to be made solely by the front-end,
so it isn't necessary to check if the OS is Darwin in the backend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243308 91177308-0d34-0410-b5e6-96231b3b80d8
There is an ODR conflict between lib/ExecutionEngine/ExecutionEngineBindings.cpp
and lib/Target/TargetMachineC.cpp. The inline definitions should simply
be marked static (thanks dblaikie for the hint).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243298 91177308-0d34-0410-b5e6-96231b3b80d8
Author: Dave Airlie <airlied@redhat.com>
In order to implement indirect sampler loads, we don't
want to match on a VGPR load but an SGPR one for constants,
as we cannot feed VGPRs to the sampler only SGPRs.
this should be applicable for llvm 3.7 as well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243294 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r243146.
Feedback from Craig Topper and David Blaikie was that we don't put const on Type as it has no mutable state.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243282 91177308-0d34-0410-b5e6-96231b3b80d8
Reapply r242295 with fixes in the implementation.
- Teaches the ValueTracker in the PeepholeOptimizer to look through PHI
instructions.
- Add findNextSourceAndRewritePHI method to lookup into multiple sources
returnted by the ValueTracker and rewrite PHIs with new sources.
With these changes we can find more register sources and rewrite more
copies to allow coaslescing of bitcast instructions. Hence, we eliminate
unnecessary VR64 <-> GR64 copies in x86, but it could be extended to
other archs by marking "isBitcast" on target specific instructions. The
x86 example follows:
A:
psllq %mm1, %mm0
movd %mm0, %r9
jmp C
B:
por %mm1, %mm0
movd %mm0, %r9
jmp C
C:
movd %r9, %mm0
pshufw $238, %mm0, %mm0
Becomes:
A:
psllq %mm1, %mm0
jmp C
B:
por %mm1, %mm0
jmp C
C:
pshufw $238, %mm0, %mm0
Differential Revision: http://reviews.llvm.org/D11197
rdar://problem/20404526
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243271 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Fix the cost of interleaved accesses for ARM/AArch64.
We were calling getTypeAllocSize and using it to check
the number of bits, when we should have called
getTypeAllocSizeInBits instead.
This would pottentially cause the vectorizer to
generate loads/stores and shuffles which cannot
be matched with an interleaved access instruction.
No performance changes are expected for now since
matching/generating interleaved accesses is still
disabled by default.
Reviewers: rengolin
Subscribers: aemerson, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D11524
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243270 91177308-0d34-0410-b5e6-96231b3b80d8
When truncating to non-legal types (such as i16, i8 and i1) always use an AND
instruction to mask out the upper bits. This was only done when the source type
was an i64, but not when the source type was an i32.
This commit fixes this and adds the missing i32 truncate tests.
This fixes rdar://problem/21990703.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243198 91177308-0d34-0410-b5e6-96231b3b80d8
extension property we're requesting - zero or sign extended.
This fixes cases where we want to return a zero extended 32-bit -1
and not be sign extended for the entire register. Also updated the
already out of date comment with the current behavior.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243192 91177308-0d34-0410-b5e6-96231b3b80d8
whether register x18 should be reserved.
This change is needed because we cannot use a backend option to set
cl::opt "aarch64-reserve-x18" when doing LTO.
Out-of-tree projects currently using cl::opt option "-aarch64-reserve-x18"
to reserve x18 should make changes to add subtarget feature "reserve-x18"
to the IR.
rdar://problem/21529937
Differential Revision: http://reviews.llvm.org/D11463
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243186 91177308-0d34-0410-b5e6-96231b3b80d8
Instead of the pattern
for (auto I = x.rbegin(), E = x.end(); I != E; ++I)
we can use make_range to construct the reverse range and iterate using
that instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243163 91177308-0d34-0410-b5e6-96231b3b80d8
We had a few places where we did
for (unsigned i = 0, e = STy->getNumElements(); i != e; ++i) {
but those could instead do
for (auto *EltTy : STy->elements()) {
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243136 91177308-0d34-0410-b5e6-96231b3b80d8