It looks like there are two versions of LowerCallTo here: the
SelectionDAGBuilder one is designed to operate on LLVM IR, and the
TargetLowering one operates when everything is already at the DAG level.
Previously, only the SelectionDAGBuilder variant could handle demoting
an impossible return to sret semantics (before delegating to the
TargetLowering version), but this functionality is also useful for
certain libcalls (e.g. 128-bit operations on 32-bit x86). So this
commit moves the sret handling down a level.
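As a rough sketch (the 128-bit type and the names here are only illustrative,
not the exact libcall involved), demoting a return to sret semantics turns a
by-value return into a store through a hidden pointer argument:

  ; before demotion: the callee would have to return the wide value directly
  declare i128 @wide_op()
  ; after demotion: the caller allocates storage and passes it as an sret pointer
  declare void @wide_op.sret(i128* sret)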
rdar://problem/17242889
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211155 91177308-0d34-0410-b5e6-96231b3b80d8
ReconstructShuffle() may wrongly create a CONCAT_VECTORS node trying to
concatenate two v2i32 values into a v4i16. This commit fixes that issue and
tries to generate UZP1 instead of lots of MOV and INS instructions.
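For reference, a hedged IR sketch of the even-lane extraction that UZP1
implements (not necessarily the exact pattern from the original test case):

  %r = shufflevector <4 x i16> %a, <4 x i16> %b, <4 x i32> <i32 0, i32 2, i32 4, i32 6>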
The patch was initially written by Kevin Qin and refactored by Tim Northover.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211144 91177308-0d34-0410-b5e6-96231b3b80d8
This patch is a follow-up to r211040 & r211052. Rather than bailing out of
fast-isel, this patch will generate an alternate instruction (movabsq) instead
of the leaq. While this will always have enough room to handle the 64-bit
displacement, it is generally overkill for internal symbols (most displacements
will be within 32 bits), but we have no way of communicating the code model to
the assembler so that it can avoid flagging an absolute leal/leaq with a
symbolic displacement as illegal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211130 91177308-0d34-0410-b5e6-96231b3b80d8
This optimizes the predicates for certain compares, such as fcmp oeq %x, %x to
fcmp ord %x, %x. The latter is more efficient to generate.
The same optimization is applied to conditional branches.
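For example (a minimal sketch; the actual test cases may differ):

  %c1 = fcmp oeq double %x, %x   ; "equal to itself" is true unless %x is NaN
  %c2 = fcmp ord double %x, %x   ; equivalent "not NaN" check, cheaper to select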
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211126 91177308-0d34-0410-b5e6-96231b3b80d8
To make sure branches are in range, we need to do a better job of estimating
the length of an inline assembly block than "it's probably 1 instruction, who'd
write asm with more than that?".
Fortunately there's already a (highly suspect, see how many ways you can think
of to break it!) callback for this purpose, which is used by the other targets.
rdar://problem/17277590
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211095 91177308-0d34-0410-b5e6-96231b3b80d8
Make use of helper functions to simplify the branch and compare instruction
selection in FastISel. Also add test cases for compare and conditional branch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211077 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch allows non-conversion assignments like i1 = i2, where both are global ints.
In addition, arithmetic and other operations start to work, since fast-isel will use
the existing non-fast-isel patterns from the TableGen files where applicable.
i8 and i16 will also work in this limited context for assignment, without the need
for extension (zero or sign). It does not matter how i8 or i16 values are loaded
(zero- or sign-extended), since only the relevant 8 or 16 bits are used and clang
will ask for sign extension before using them in arithmetic. This is all made more
complete in forthcoming patches.
For example:
  int i, j = 1, k = 3;
  void foo() {
    i = j + k;
  }
Keep in mind that this pass is not enabled right now and is experimental.
It can only be enabled with the hidden LLVM option -mips-fast-isel.
Test Plan: Run the test-suite and loadstore2.ll; I will also run some executable tests.
Reviewers: dsanders
Subscribers: mcrosier
Differential Revision: http://reviews.llvm.org/D3856
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211061 91177308-0d34-0410-b5e6-96231b3b80d8
Rafael opened http://llvm.org/bugs/show_bug.cgi?id=19893 to track non-optimal
code generation for forming a function address that is local to the compile
unit. The existing code was treating both local and non-local functions
identically.
This patch fixes the problem by properly identifying local functions and
generating the proper addis/addi code. I also noticed that Rafael's earlier
changes to correct the surrounding code in PPCISelLowering.cpp were also
needed for fast instruction selection in PPCFastISel.cpp, so this patch
fixes that code as well.
The existing test/CodeGen/PowerPC/func-addr.ll is modified to test the new
code generation. I've added a -O0 run line to test the fast-isel code as
well.
Tested on powerpc64[le]-unknown-linux-gnu with no regressions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211056 91177308-0d34-0410-b5e6-96231b3b80d8
ARM v7M has ldrex/strex but not ldrexd/strexd. This means 32-bit
operations should work as normal, but 64-bit ones are almost certainly
doomed.
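A minimal IR sketch of the split, assuming atomic read-modify-write operations
(which are lowered via the exclusive load/store instructions):

  %old32 = atomicrmw add i32* %p32, i32 1 seq_cst   ; ldrex/strex: fine on v7M
  %old64 = atomicrmw add i64* %p64, i64 1 seq_cst   ; would need ldrexd/strexd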
Patch by Phoebe Buckheister.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211042 91177308-0d34-0410-b5e6-96231b3b80d8
On x86_64 the lea instruction can only use a 32-bit immediate value. When
the code is compiled statically, the RIP register is not used, meaning the
immediate is all that can be used for the relocation, which is not sufficient
in the case of targets more than +/- 2GB away. This patch bails out of
fast-isel in those cases and reverts to SelectionDAG, which does the right thing.
Test case included.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211040 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
There is no change to the restrictions, just the result register is stored
once in the encoding rather than twice. The rt field is zero in
MIPS32r6/MIPS64r6.
Depends on D4119
Reviewers: zoran.jovanovic, jkolek, vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D4120
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211019 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The linked-load, store-conditional operations have been re-encoded such
that they have a 9-bit offset instead of the 16-bit offset they had prior to
MIPS32r6/MIPS64r6.
While implementing this, I noticed that the atomic load/store pseudos always
emit a sign extension using sll and sra. I have improved this to use seb/seh
when they are available (MIPS32r2/MIPS64r2 and above).
Depends on D4118
Reviewers: jkolek, zoran.jovanovic, vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D4119
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211018 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
There is very little difference between the big and little endian cases in
test/CodeGen/Mips/atomic.ll. Merge them together using multiple
FileCheck prefixes.
Depends on D4117
Reviewers: jkolek, zoran.jovanovic, vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D4118
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211013 91177308-0d34-0410-b5e6-96231b3b80d8
As a follow-up to r210375 which canonicalizes addrspacecast
instructions, this patch canonicalizes addrspacecast constant
expressions.
Given that clang uses ConstantExpr::getAddrSpaceCast to emit addrspacecast
constant expressions, this patch is also a step towards having the
frontend emit canonicalized addrspacecasts.
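As an illustration (the address space numbers are just placeholders), an
addrspacecast constant expression looks like:

  @g = addrspace(3) global i32 0
  @p = global i32* addrspacecast (i32 addrspace(3)* @g to i32*)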
Piggyback a minor refactor in InstCombineCasts.cpp
Update the three affected tests in addrspacecast-alias.ll,
access-non-generic.ll and constant-fold-gep.ll, and add one new test in
constant-fold-address-space-pointer.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211004 91177308-0d34-0410-b5e6-96231b3b80d8
There's probably no actual change in behaviour here, just updating
the LowerFP_TO_INT function to be more similar to the reverse
implementation and updating the costs to current CodeGen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210985 91177308-0d34-0410-b5e6-96231b3b80d8
This would assert if a constant address space global was extern
and therefore didn't have an initializer. If the initializer
was undef, it would hit the unreachable unhandled-initializer case.
An extern global should never really occur since we don't have
machine linking, but bugpoint likes to remove initializers.
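A hedged example of the kind of input involved (the address space number is
illustrative only):

  @arr = external addrspace(2) constant [4 x i32]   ; extern: no initializer to lower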
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210967 91177308-0d34-0410-b5e6-96231b3b80d8
This patch moves the GlobalMerge pass from Transforms/Scalar
to CodeGen, because GlobalMerge depends on TargetMachine.
At the same time, the macro INITIALIZE_TM_PASS is also moved
to CodeGen/Passes.h. With this fix we can avoid making
libScalarOpts depend on libCodeGen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210951 91177308-0d34-0410-b5e6-96231b3b80d8
This doesn't fix the abstract variable handling yet, but it introduces a
similar delay mechanism as was added for subprograms, causing
DW_AT_location to be reordered to the beginning of the attribute list
for local variables, and fixes all the test fallout for that.
A subsequent commit will remove the abstract variable handling in
DbgVariable and just do the abstract variable lookup at module end to
ensure that abstract variables introduced after their concrete
counterparts are appropriately referenced by the concrete variable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210943 91177308-0d34-0410-b5e6-96231b3b80d8
Lowering this new node allows us to fold the almost universal
comparison for success before it is even formed. Instead, we can create
a copy from EFLAGS and an X86ISD::SETCC operation, since all "cmpxchg"
instructions set the zero flag to the correct value.
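For instance (a hedged sketch of the pattern; names are illustrative):

  %pair = cmpxchg i32* %p, i32 %expected, i32 %new seq_cst seq_cst
  %ok   = extractvalue { i32, i1 } %pair, 1
  ; %ok can now come straight from the zero flag via X86ISD::SETCC rather than
  ; from re-comparing the loaded value against %expected.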
rdar://problem/13201607
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210923 91177308-0d34-0410-b5e6-96231b3b80d8
This also simplifies the IR we create slightly: instead of working out
where success & failure should go manually, it turns out we can just
always jump to a success/failure block created for the purpose. Later
phases will sort out the mess without much difficulty.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210917 91177308-0d34-0410-b5e6-96231b3b80d8
This commit adds a weak variant of the cmpxchg operation, as described
in C++11. A cmpxchg instruction with this modifier is permitted to
fail to store, even if the comparison indicated it should.
As a result, cmpxchg instructions must return a flag indicating
success in addition to their original iN value loaded. Thus, for
uniformity *all* cmpxchg instructions now return "{ iN, i1 }". The
second flag is 1 when the store succeeded.
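A minimal IR sketch of the new form (names are illustrative):

  %res     = cmpxchg weak i32* %addr, i32 %expected, i32 %desired seq_cst seq_cst
  %loaded  = extractvalue { i32, i1 } %res, 0   ; the value read from memory
  %success = extractvalue { i32, i1 } %res, 1   ; 1 if the store actually happened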
At the DAG level, a new ATOMIC_CMP_SWAP_WITH_SUCCESS node has been
added as the natural representation for the new cmpxchg instructions.
It is a strong cmpxchg.
By default this gets Expanded to the existing ATOMIC_CMP_SWAP during
Legalization, so existing backends should see no change in behaviour.
If they wish to deal with the enhanced node instead, they can call
setOperationAction on it. Beware: as a node with 2 results, it cannot
be selected from TableGen.
Currently, no use is made of the extra information provided in this
patch. Test updates are almost entirely adapting the input IR to the
new scheme.
Summary for out-of-tree users:
------------------------------
+ Legacy Bitcode files are upgraded during read.
+ Legacy assembly IR files will be invalid.
+ Front-ends must adapt to the different return type of "cmpxchg".
+ Backends should be unaffected by default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210903 91177308-0d34-0410-b5e6-96231b3b80d8
When targeting Thumb1 on a processor which has a VFP unit (which
is not accessible from Thumb1), we were converting the fastcc calling
convention to AAPCS-VFP, which is not possible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210889 91177308-0d34-0410-b5e6-96231b3b80d8
This adds support for the cvttss2si/cvttsd2si intrinsics. Preceding
insertelement instructions are folded into the conversion instruction (if
possible).
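A hedged sketch of the kind of pattern that can now be selected, with the
insertelement folded into the conversion (the function name is illustrative):

  declare i32 @llvm.x86.sse.cvttss2si(<4 x float>)

  define i32 @trunc_to_i32(float %x) {
    %v = insertelement <4 x float> undef, float %x, i32 0
    %r = call i32 @llvm.x86.sse.cvttss2si(<4 x float> %v)
    ret i32 %r
  }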
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210870 91177308-0d34-0410-b5e6-96231b3b80d8
Add branch weights to branch instructions, so that subsequent passes can
optimize based on them (e.g. basic block ordering).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210863 91177308-0d34-0410-b5e6-96231b3b80d8
This commit adds MachineMemOperands to load and store instructions. This allows
the peephole optimizer to fold load instructions. Unfortunately, the peephole
optimizer currently doesn't run at -O0.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210858 91177308-0d34-0410-b5e6-96231b3b80d8
Delete all the unused ones, and add new AMDGPU named intrinsics for
the ones that are used. Handle the old AMDIL names for compatibility (although
remove their GCCBuiltin names) and add tests, since there weren't any
for these before.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210827 91177308-0d34-0410-b5e6-96231b3b80d8
Recommit with fixed argument attribute checking code, which is required to bail
out of all the cases we don't handle yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210815 91177308-0d34-0410-b5e6-96231b3b80d8