In this patch I have only encoding. Intrinsics and DAG lowering will be in the next patch.
I temporary removed the old intrinsics test (just to split this patch).
Half types are not covered here.
Differential Revision: http://reviews.llvm.org/D11134
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242023 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This at least saves compile time. I also encountered a case where
ephemeral values affect whether other variables are promoted, causing
performance issues. It may be a bug in LSR, but I didn't manage to
reduce it yet. Anyhow, I believe it's in general not worth considering
ephemeral values in LSR.
Reviewers: atrick, hfinkel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11115
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242011 91177308-0d34-0410-b5e6-96231b3b80d8
Register r12 ('ip') is used by GCC for this purpose
and hence is used here. As discussed on the GCC mailing
list, the register choice is an ABI issue and so
choosing the same register as GCC means
__builtin_call_with_static_chain is compatible.
A similar patch has just gone in the AArch64 backend,
so this is just the ARM counterpart, following the same
discussion.
Patch by Stephen Cross.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241996 91177308-0d34-0410-b5e6-96231b3b80d8
While the v4i32 shl operation is already vectorized using a cvttps2dq/pmulld pattern, the lshr/ashr opeations are still scalarized.
This patch adds vectorization support for non-uniform v4i32 shift operations - it splats constant shift amounts to allow them to use the immediate sse shift instructions, or extracts/zero-extends non-constant shift amounts. The individual results are then blended together.
Differential Revision: http://reviews.llvm.org/D11063
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241989 91177308-0d34-0410-b5e6-96231b3b80d8
There is no suitable basic block to sink instructions in loops without
exits. The only way an instruction in a loop without exits can be used
is as an incoming value to a PHI. In such cases, the incoming block for
the corresponding value is unreachable.
This fixes PR24013.
Differential Revision: http://reviews.llvm.org/D10903
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241987 91177308-0d34-0410-b5e6-96231b3b80d8
r238842 added the TargetRecip system for controlling use of reciprocal
estimates for sqrt and division using a set of parameters that can be set by
the frontend. Clang now supports a sophisticated -mrecip option, and this will
allow that option to effectively control the relevant code-generation
functionality of the PPC backend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241985 91177308-0d34-0410-b5e6-96231b3b80d8
This adds support for the 'nest' attribute, which allows the static chain
register to be set for functions calls under non-Darwin PPC/PPC64 targets. r11
is the chain register (which the PPC64 ELF ABI calls the "environment
pointer"). For indirect calls under PPC64 ELFv1, this would normally be loaded
from the function descriptor, but providing an explicit 'nest' parameter will
override that process and use the value provided.
This allows __builtin_call_with_static_chain to work as expected on PowerPC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241984 91177308-0d34-0410-b5e6-96231b3b80d8
r236894 caused PR23626 (Clang miscompiles webkit's base64 decoder), and was
reverted in r237984. This reapplies the patch with an additional test case for
PR23626 and the associated fix (both scales and offsets in the
BasicAliasAnalysis::constantOffsetHeuristic should initially be zero).
Patch by Nick White, thanks!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241981 91177308-0d34-0410-b5e6-96231b3b80d8
This change adds new attribute called "argmemonly". Function marked with this attribute can only access memory through it's argument pointers. This attribute directly corresponds to the "OnlyAccessesArgumentPointees" ModRef behaviour in alias analysis.
Differential Revision: http://reviews.llvm.org/D10398
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241979 91177308-0d34-0410-b5e6-96231b3b80d8
This in turn would sometimes introduce new cleanupblocks that didn't
previously exist. The uses were being introduced by SSA value demotion.
We actually want to *promote* uses of EH pointers and selectors, so I
added some spcecial casing to avoid demoting such instructions. This is
getting overly complicated, but hopefully we'll come along and delete it
in the new representation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241950 91177308-0d34-0410-b5e6-96231b3b80d8
The motivation is to allow GatherAllAliases / FindBetterChain
to not give up on dependent loads of a pointer from constant memory.
This is important for AMDGPU, because most loads are pointers
derived from a load of a kernel argument from constant memory.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241948 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes PR23804: assertion failure in emitPrologue in the case of a
function with an empty frame and a dynamic alloca that needs stack
realignment. This is a typical case for AddressSanitizer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241943 91177308-0d34-0410-b5e6-96231b3b80d8
This commit factors out common code from MergeBaseUpdateLoadStore() and
MergeBaseUpdateLSMultiple() and introduces a new function
MergeBaseUpdateLSDouble() which merges adds/subs preceding/following a
strd/ldrd instruction into an strd/ldrd instruction with writeback where
possible.
Differential Revision: http://reviews.llvm.org/D10676
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241928 91177308-0d34-0410-b5e6-96231b3b80d8
If our two inputs have known top-zero bit counts M and N, we trivially
know that the output cannot have any bits set in the top (min(M, N)-1)
bits, since nothing could carry past that point.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241927 91177308-0d34-0410-b5e6-96231b3b80d8
This commit implements the initial serialization of stack objects from the
MachineFrameInfo class. It can only serialize the ordinary stack objects
(including ordinary spill slots), but it doesn't serialize variable sized or
fixed stack objects yet.
The stack objects are serialized using a YAML sequence of YAML inline mappings.
Each mapping has the object's ID, type, size, offset and alignment. The stack
objects are a part of machine function's YAML mapping.
Reviewers: Duncan P. N. Exon Smith
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241922 91177308-0d34-0410-b5e6-96231b3b80d8
This improves the logic in several ways and is a preparation for
followup patches:
- First perform an analysis and create a list of merge candidates, then
transform. This simplifies the code in that you have don't have to
care to much anymore that you may be holding iterators to
MachineInstrs that get removed.
- Analyze/Transform basic blocks in reverse order. This allows to use
LivePhysRegs to find free registers instead of the RegisterScavenger.
The RegisterScavenger will become less precise in the future as it
relies on the deprecated kill-flags.
- Return the newly created node in MergeOps so there's no need to look
around in the schedule to find it.
- Rename some MBBI iterators to InsertBefore to make their role clear.
- General code cleanup.
Differential Revision: http://reviews.llvm.org/D10140
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241920 91177308-0d34-0410-b5e6-96231b3b80d8
FCmp behaves a lot like a floating-point binary operator in many ways,
and can benefit from fast-math information. Flags such as nsz and nnan
can affect if this fcmp (in combination with a select) can be treated
as a fminnum/fmaxnum operation.
This adds backwards-compatible bitcode support, IR parsing and writing,
LangRef changes and IRBuilder changes. I'll need to audit InstSimplify
and InstCombine in a followup to find places where flags should be
copied.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241901 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This introduces new instructions neccessary to implement MSVC-compatible
exception handling support. Most of the middle-end and none of the
back-end haven't been audited or updated to take them into account.
Reviewers: rnk, JosephTremoulet, reames, nlewycky, rjmccall
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11041
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241888 91177308-0d34-0410-b5e6-96231b3b80d8
Not doing this can lead to misoptimizations down the line, e.g. because
of range metadata on the replacing load excluding values that are valid
for the load that is being replaced.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241886 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Without this patch, LoopVectorizer in certain cases (see loop-vectorize.ll)
produces code with complex control flow which hurts later optimizations. Since
NVPTX doesn't have vector registers in LLVM's sense
(NVPTXTTI::getRegisterBitWidth(true) == 32), we for now declare no vector
registers to effectively disable loop vectorization.
Reviewers: jholewinski
Subscribers: jingyue, llvm-commits, jholewinski
Differential Revision: http://reviews.llvm.org/D11089
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241884 91177308-0d34-0410-b5e6-96231b3b80d8
Apparently this is important, otherwise _except_handler3 assumes that
the registration node is corrupted and ignores it.
Also fix a bug in WinEHPrepare where we would insert code after a
terminator instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241877 91177308-0d34-0410-b5e6-96231b3b80d8
The virtual registers are serialized using a YAML sequence of YAML inline
mappings. Each mapping has the id of the virtual register and the register
class.
Reviewers: Duncan P. N. Exon Smith
Differential Revision: http://reviews.llvm.org/D10981
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241868 91177308-0d34-0410-b5e6-96231b3b80d8
The runtime does not restore CSRs when transferring control back to the
function handling the exception. According to the experts on IRC, LLVM's
register allocator has no way to model register clobbers that only
happen on one edge of the CFG. For now, don't worry about trying to use
the meager three CSRs available on 32-bit X86 and just say that such
invokes preserve nothing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241865 91177308-0d34-0410-b5e6-96231b3b80d8
This commit adds a new error which is reported when the MIR Parser encounters
a machine function without any machine basic blocks. The machine verifier
expects that the machine functions have at least one MBB, and this error will
prevent machine functions without MBBs from reaching the machine verifier and
crashing with an assertion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241862 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Before this change ImplicitNullChecks would only pick loads of the form:
```
test Reg, Reg
jz elsewhere
fallthrough:
movl 32(Reg), Reg2
```
but not (say)
```
test Reg, Reg
jz elsewhere
fallthrough:
inc Reg3
movl 32(Reg), Reg2
```
This change teaches ImplicitNullChecks to look through "unrelated"
instructions like `inc Reg3` when searching for a load instruction
to convert to a trapping load.
Reviewers: atrick, JosephTremoulet, reames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11044
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241850 91177308-0d34-0410-b5e6-96231b3b80d8
They should probably be created on anything that is not windows or linux, but I will
test on freebsd before changing that.
With this it is possible to bootstrap with llvm-ar instead of ar+ranlib on OS X.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241849 91177308-0d34-0410-b5e6-96231b3b80d8
This commit serializes the 13 scalar boolean and integer attributes from the
MachineFrameInfo class: IsFrameAddressTaken, IsReturnAddressTaken, HasStackMap,
HasPatchPoint, StackSize, OffsetAdjustment, MaxAlignment, AdjustsStack,
HasCalls, MaxCallFrameSize, HasOpaqueSPAdjustment, HasVAStart, and
HasMustTailInVarArgFunc. These attributes are serialized as part
of the frameInfo YAML mapping, which itself is a part of the machine function's
YAML mapping.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241844 91177308-0d34-0410-b5e6-96231b3b80d8
It looks like ld64 requires it. With this we seem to be able to bootstrap using
llvm-ar+/usr/bin/true instead of ar+ranlib (currently on stage2).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241842 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
In RewriteLoopExitValues, before expanding out an SCEV expression using
SCEVExpander, try to see if an existing LLVM IR expression already
computes the value we're interested in. If so use that existing
expression.
Apart from reducing IndVars' reliance on the rest of the compilation
pipeline, this also prevents IndVars from concluding some expressions as
"high cost" when they're not. For instance,
`InductiveRangeCheckElimination` often emits code of the following form:
```
len = umin(len_A, len_B)
loop:
...
if (i++ < len)
goto loop
outside_loop:
use(i)
```
`SCEVExpander` refuses to rewrite the use of `i` in `outside_loop`,
since it thinks the value of `i` on loop exit, `len`, is a high cost
expansion since it contains an `umax` in it. With this change,
`IndVars` can see that it can re-use `len` instead of creating a new
expression to compute `umin(len_A, len_B)`.
I considered putting this cleverness in `SCEVExpander`, but I was
worried that it may then have a deterimental effect on other passes
that use it. So I decided it was better to just do this in the one
place where it seems like an obviously good idea, with the intent of
generalizing later if needed.
Reviewers: atrick, reames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10782
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241838 91177308-0d34-0410-b5e6-96231b3b80d8
Don't let the disassembler pick call <.text> if a function happens to
live at the start of the section by only using function symbols.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241830 91177308-0d34-0410-b5e6-96231b3b80d8
This patch allows the read_register and write_register intrinsics to
read/write the RBP/EBP registers on X86 iff the targeted register is
the frame pointer for the containing function.
Differential Revision: http://reviews.llvm.org/D10977
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241827 91177308-0d34-0410-b5e6-96231b3b80d8
This patch fixes bugs that were exposed by the addition of fast-math-flags in the DAG:
r237046 ( http://reviews.llvm.org/rL237046 ):
1. When replacing a division node, it's not enough to RAUW.
We should call CombineTo() to delete dead nodes and combine again.
2. Because we are changing the DAG, we can't return an empty SDValue
after the transform. As the code comments say:
Visitation implementation - Implement dag node combining for different node types.
The semantics are as follows: Return Value:
SDValue.getNode() == 0 - No change was made
SDValue.getNode() == N - N was replaced, is dead and has been handled.
otherwise - N should be replaced by the returned Operand.
The new test case shows no difference with or without this patch, but it will crash if
we re-apply r237046 or enable FMF via the current -enable-fmf-dag cl::opt.
Differential Revision: http://reviews.llvm.org/D9893
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241826 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We were missing a corner case where DepCands was not available,
but we were using DepCands to compute the checking pointer
groups.
This adds a test for that regression.
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11068
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241818 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The checking pointer group construction algorithm relied on the iteration on DepCands.
We would need the same leaders across runs and the same iteration order over the underlying std::set for determinism.
This changes the algorithm to process the pointers in the order in which they were added to the runtime check, which is deterministic.
We need to update the tests, since the order in which pointers appear has changed.
No new tests were added, since it is impossible to test for non-determinism.
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11064
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241809 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: If shift amount is a constant value > 64 bit it is handled incorrectly during type legalization and X86 lowering. This patch the type of shift amount argument in function DAGTypeLegalizer::ExpandShiftByConstant from unsigned to APInt.
Reviewers: nadav, majnemer, sanjoy, RKSimon
Subscribers: RKSimon, llvm-commits
Differential Revision: http://reviews.llvm.org/D10767
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241806 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: If shift amount is a constant value > 64 bit it is handled incorrectly during type legalization and X86 lowering. This patch the type of shift amount argument in function DAGTypeLegalizer::ExpandShiftByConstant from unsigned to APInt.
Reviewers: nadav, majnemer, sanjoy, RKSimon
Subscribers: RKSimon, llvm-commits
Differential Revision: http://reviews.llvm.org/D10767
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241790 91177308-0d34-0410-b5e6-96231b3b80d8
The justification of this change is here: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-March/082989.html
According to the current GEP syntax, vector GEP requires that each index must be a vector with the same number of elements.
%A = getelementptr i8, <4 x i8*> %ptrs, <4 x i64> %offsets
In this implementation I let each index be or vector or scalar. All vector indices must have the same number of elements. The scalar value will mean the splat vector value.
(1) %A = getelementptr i8, i8* %ptr, <4 x i64> %offsets
or
(2) %A = getelementptr i8, <4 x i8*> %ptrs, i64 %offset
In all cases the %A type is <4 x i8*>
In the case (2) we add the same offset to all pointers.
The case (1) covers C[B[i]] case, when we have the same base C and different offsets B[i].
The documentation is updated.
http://reviews.llvm.org/D10496
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241788 91177308-0d34-0410-b5e6-96231b3b80d8
Column information is present in CodeView when the line table subsection
has bit 0 set to 1 in it's flags field. The column information is
represented as a pair of 16-bit quantities: a starting and ending
column. This information is present at the end of the chunk, after all
the line-PC pairs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241764 91177308-0d34-0410-b5e6-96231b3b80d8
This commit ([LAA] Fix estimation of number of memchecks) regressed the
logic a bit. We shouldn't quit the analysis if we encounter a pointer
without known bounds *unless* we actually need to emit a memcheck for
it.
The original code was using NumComparisons which is now computed
differently. Instead I compute NeedRTCheck from NumReadPtrChecks and
NumWritePtrChecks.
As side note, I find the separation of NeedRTCheck and CanDoRT
confusing, so I will try to merge them in a follow-up patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241756 91177308-0d34-0410-b5e6-96231b3b80d8
All the usual X86 target-specific conventions are collapsed to the
normal Win64 convention, but the custom conventions like GHC and webkit
should not be.
Previously we would assume that the caller allocated 32 bytes of shadow
space for us, which is not how webkit_jscc or other custom conventions
are supposed to work.
Based on a patch by peavo@outlook.com.
Fixes PR24051.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241725 91177308-0d34-0410-b5e6-96231b3b80d8
No support for the symbol table yet (but will hopefully add it today).
We always use the long filename format so that we can align the member,
which is an advantage of the BSD format.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241721 91177308-0d34-0410-b5e6-96231b3b80d8
This commit changes the type of the field 'Name' in the struct
'yaml::MachineBasicBlock' from 'std::string' to 'yaml::StringValue'. This change
allows the MIR parser to report errors related to the MBB name with the proper
source locations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241718 91177308-0d34-0410-b5e6-96231b3b80d8
The inferred output file name is based on the first input file, not the
first one with extension .obj. The output file was also being written to
the wrong directory; it needs to be written to whichever directory on the
libpath it was found in. This change fixes both issues.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241710 91177308-0d34-0410-b5e6-96231b3b80d8
The 32-bit lowering assumed that WinEHPrepare had this invariant.
WinEHPrepare did it for C++, but not SEH. The result was that we would
insert calls to llvm.x86.seh.restoreframe in normal basic blocks, which
corrupted the frame pointer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241699 91177308-0d34-0410-b5e6-96231b3b80d8
- Implement copying ASR to/from GPR regs.
- Mark ASRs as non-allocatable, so it won't try to arbitrarily use
them inappropriately.
- Instead of inserting explicit WRASR/RDASR nodes in the MUL/DIV
routines, just do normal register copies.
- Also...mark div as using Y, not just writing it.
Added a test case with some code which previously died with an
assertion failure (with -O0), or produced wrong code (otherwise).
(Third time's the charm?)
Differential Revision: http://reviews.llvm.org/D10401
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241686 91177308-0d34-0410-b5e6-96231b3b80d8
Use AddressAlign field's value to properly align sections content in the
yaml2obj tool. Before this change the yaml2obj ignored AddressAlign and
always aligned section on 16 bytes boundary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241674 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Often filter-like loops will do memory accesses that are
separated by constant offsets. In these cases it is
common that we will exceed the threshold for the
allowable number of checks.
However, it should be possible to merge such checks,
sice a check of any interval againt two other intervals separated
by a constant offset (a,b), (a+c, b+c) will be equivalent with
a check againt (a, b+c), as long as (a,b) and (a+c, b+c) overlap.
Assuming the loop will be executed for a sufficient number of
iterations, this will be true. If not true, checking against
(a, b+c) is still safe (although not equivalent).
As long as there are no dependencies between two accesses,
we can merge their checks into a single one. We use this
technique to construct groups of accesses, and then check
the intervals associated with the groups instead of
checking the accesses directly.
Reviewers: anemet
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10386
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241673 91177308-0d34-0410-b5e6-96231b3b80d8
option that works with all object container formats.
Now that clang modules/PCH are object containers this option is useful to
to construct pipes like
llvm-objdump -raw-clang-ast foo.pcm | llvm-bcanalyzer -
to inspect the AST contents in a PCH container.
Will be tested via clang.
Belatedly addresses review feedback for r233390.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241659 91177308-0d34-0410-b5e6-96231b3b80d8
The incoming EBP value points to the end of a local stack allocation, so
we can use that to restore ESI, the base pointer. Once we do that, we
can use local stack allocations. If we know we need stack realignment,
spill the original frame pointer in the prologue and reload it after
restoring ESI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241648 91177308-0d34-0410-b5e6-96231b3b80d8
Clang uses this for SEH finally. The new intrinsic will produce the
right value when stack realignment is required.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241643 91177308-0d34-0410-b5e6-96231b3b80d8
Tim Northover has told me that they can occur when the compiler cleverly
constructs constants - as demonstrated in the test case.
rdar://21703486
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241641 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Initially, these intrinsics seemed like part of a family of "frame"
related intrinsics, but now I think that's more confusing than helpful.
Initially, the LangRef specified that this would create a new kind of
allocation that would be allocated at a fixed offset from the frame
pointer (EBP/RBP). We ended up dropping that design, and leaving the
stack frame layout alone.
These intrinsics are really about sharing local stack allocations, not
frame pointers. I intend to go further and add an `llvm.localaddress()`
intrinsic that returns whatever register (EBP, ESI, ESP, RBX) is being
used to address locals, which should not be confused with the frame
pointer.
Naming suggestions at this point are welcome, I'm happy to re-run sed.
Reviewers: majnemer, nicholas
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11011
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241633 91177308-0d34-0410-b5e6-96231b3b80d8
GNU binutils provides this behavior. objdump -r doesn't really help
when you aren't dealing with relocation object files.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241631 91177308-0d34-0410-b5e6-96231b3b80d8
Since the NvCast is generated by the selection process the concerns about
endianess and bit reversal don't apply.
rdar://21703486
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241611 91177308-0d34-0410-b5e6-96231b3b80d8
getSymbolValue now returns a value that in convenient for most callers:
* 0 for undefined
* symbol size for common symbols
* offset/address for symbols the rest
Code that needs something more specific can check getSymbolFlags.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241605 91177308-0d34-0410-b5e6-96231b3b80d8
This commit changes the target arch to fix the test case commited in r241566
that was failing on ninja-x64-msvc-RA-centos6. Also add checks to make sure
the callee's address is loaded to blx's operand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241588 91177308-0d34-0410-b5e6-96231b3b80d8
They are implemented like that in some object formats, but for the interface
provided by lib/Object, SF_Undefined and SF_Common are different things.
This matches the ELF and COFF implementation and fixes llvm-nm for MachO.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241587 91177308-0d34-0410-b5e6-96231b3b80d8
be emitted.
This is needed to enable ARM long calls for LTO and enable and disable it on a
per-function basis.
Out-of-tree projects currently using EnableARMLongCalls to emit long calls
should start passing "+long-calls" to the feature string (see the changes made
to clang in r241565).
rdar://problem/21529937
Differential Revision: http://reviews.llvm.org/D9364
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241566 91177308-0d34-0410-b5e6-96231b3b80d8
This commit verifies that the parsed machine instructions contain the implicit
register operands as specified by the MCInstrDesc. Variadic and call
instructions aren't verified.
Reviewers: Duncan P. N. Exon Smith
Differential Revision: http://reviews.llvm.org/D10781
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241537 91177308-0d34-0410-b5e6-96231b3b80d8
This commit serializes the implicit flag for the register machine operands. It
introduces two new keywords into the machine instruction syntax: 'implicit' and
'implicit-def'. The 'implicit' keyword is used for the implicit register
operands, and the 'implicit-def' keyword is used for the register operands that
have both the implicit and the define flags set.
Reviewers: Duncan P. N. Exon Smith
Differential Revision: http://reviews.llvm.org/D10709
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241519 91177308-0d34-0410-b5e6-96231b3b80d8
The vperm2f128/vperm2i128 shuffle mask decoding was not attempting to deal with shuffles that give zero lanes. This patch fixes this so that the assembly printer can provide shuffle comments.
As this decoder is also used in X86ISelLowering for shuffle combining, I've added an early-out to match existing behaviour. The hope is that we can add zero support in the future, this would allow other ops' decodes (e.g. insertps) to be combined as well.
Differential Revision: http://reviews.llvm.org/D10593
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241516 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds support for v8i16 and v16i8 shuffle lowering using the immediate versions of the SSE4A EXTRQ and INSERTQ instructions. Although rather limited (they can only act on the lower 64-bits of the source vectors, leave the upper 64-bits of the result vector undefined and don't have VEX encoded variants), the instructions are still useful for the zero extension of any lane (EXTRQ) or inserting a lane into another vector (INSERTQ). Testing demonstrated that it wasn't typically worth it to use these instructions for v2i64 or v4i32 vector shuffles although they are capable of it.
As well as adding specific pattern matching for the shuffles, the patch uses EXTRQ for zero extension cases where SSE41 isn't available and its more efficient than the SSE2 'unpack' default approach. It also adds shuffle decode support for the EXTRQ / INSERTQ cases when the instructions are handling full byte-sized extractions / insertions.
From this foundation, future patches will be able to make use of the instructions for situations that use their ability to extract/insert at the bit level.
Differential Revision: http://reviews.llvm.org/D10146
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241508 91177308-0d34-0410-b5e6-96231b3b80d8
This commit adds a 'run-pass' option to llc, which instructs the compiler to run
one specific code generation pass only.
Llc already has the 'start-after' and the 'stop-after' options, and this new
option complements the other two by making it easier to write tests that want
to invoke a single pass only.
Reviewers: Duncan P. N. Exon Smith
Differential Revision: http://reviews.llvm.org/D10776
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241476 91177308-0d34-0410-b5e6-96231b3b80d8
We don't have a good way to detect most situations where
DS offsets are usable on SI, so add an option to force using
them even if unsafe for debugging performance problems.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241462 91177308-0d34-0410-b5e6-96231b3b80d8
When talking about the virtual address of sections the coff spec says:
... for simplicity, compilers should set this to zero. Otherwise, it is an
arbitrary value that is subtracted from offsets during relocation.
We don't currently subtract it, so check that it is zero.
If some producer does create such files, we can change getRelocationOffset
instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241447 91177308-0d34-0410-b5e6-96231b3b80d8
Followup to D10433 and D10589 that fixes i8/i16 uint2fp vector conversions by zero extending to i32 and using the sint2fp path (unless the target does actually support uint2fp).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241394 91177308-0d34-0410-b5e6-96231b3b80d8
Requested by Eugene Rozenfeld of the LLILC team, this feature allows JIT
clients to skip relocations for selected external symbols by returning ~0ULL
from their symbol resolver. If this value is returned for a given symbol,
RuntimeDyld will skip all relocations for that symbol. The client will be
responsible for applying the skipped relocations manually before the code
is executed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241383 91177308-0d34-0410-b5e6-96231b3b80d8
SHT_NOBITS sections do not have content in an object file. Now the yaml2obj
tool does not accept `Content` field for such sections, and the obj2yaml
tool does not attempt to read the section content from a file.
Restore r241350 and r241352.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241377 91177308-0d34-0410-b5e6-96231b3b80d8
r241350 broke lld tests.
r241352 depends on r241350.
Original messages:
"[ELFYAML] Fix handling SHT_NOBITS sections by obj2yaml/yaml2obj tools"
"[ELFYAML] Make the Size field for .bss section optional"
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241354 91177308-0d34-0410-b5e6-96231b3b80d8
SHT_NOBITS sections do not have content in an object file. Now yaml2obj
tool does not accept `Content` field for such sections, and obj2yaml
tool does not attempt to read the section content from a file.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241350 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds support for sign extension for sub 128-bit vectors, such as to v2i32. It concatenates with UNDEF subvectors up to 128-bits, performs the sign extension (i.e. as v4i32) and then extracts the target subvector.
Patch 1/2 of D10589 - the second patch covers the conversion of v2i8/v2i16 to v2f64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241323 91177308-0d34-0410-b5e6-96231b3b80d8
The assertion in getCopyFromPartsVector assumed that the vector 'part' must
match the type of argument (arguments are potentially split into multiple
parts). However, in some cases the targets return a 'part' of the right size
but with a different type. We already handle this case correctly later on
and generate a bitcast. This commit just makes sure that we are actually
checking the property that we care about.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241312 91177308-0d34-0410-b5e6-96231b3b80d8
This commit changes normal isel and fast isel to read the user-defined trap
function name from function attribute "trap-func-name" attached to llvm.trap or
llvm.debugtrap instead of from TargetOptions::TrapFuncName. This is needed to
use clang's command line option "-ftrap-function" for LTO and enable changing
the trap function name on a per-call-site basis.
Out-of-tree projects currently using TargetOptions::TrapFuncName to specify the
trap function name should attach attribute "trap-func-name" to the call sites
of llvm.trap and llvm.debugtrap instead.
rdar://problem/21225723
Differential Revision: http://reviews.llvm.org/D10832
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241305 91177308-0d34-0410-b5e6-96231b3b80d8
In r241285, I removed the SUBREG_TO_REG restriction from VSX swap
removal, determining that this was overly conservative. We have
another form of the same restriction in that we check for the presence
of implicit subregs in vector operations. As with SUBREG_TO_REG for
partial register conversions, an implicit subreg is safe in and of
itself, provided no other operation makes a lane-sensitive assumption
about the result. This patch removes that restriction, by removing
the HasImplicitSubreg flag and all code that relies on it.
I've added a test case that fails to optimize before this patch is
applied, and optimizes properly with the patch. Test based on a
report from Anton Blanchard.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241290 91177308-0d34-0410-b5e6-96231b3b80d8
With a previous patch, the VSX swap optimization is able to recognize
the doubleword load-splat idiom that can be implemented using lxvdsx.
However, that does not cover a doubleword splat where the source is a
register. We can implement this using xxspltd (a special form of
xxpermdi). This patch teaches the swap optimization pass about this
idiom.
As a prerequisite, it also permits swap optimization to succeed for
all forms of SUBREG_TO_REG. Previously we were conservative and only
allowed SUBREG_TO_REG when it copied a full register. However, on
reflection any form of SUBREG_TO_REG is safe in and of itself, so long
as an unsafe operation is not performed on its result. In particular,
a widening SUBREG_TO_REG often occurs as an input to a doubleword
splat idiom, particularly in auto-vectorized code.
The doubleword splat idiom is an XXPERMDI operation where both source
registers are identical, and the selection mask is either 0 (splat the
first element) or 3 (splat the second element). To determine whether
the registers are identical, we use the existing mechanism for looking
through "copy-like" operations. That mechanism has a side effect of
marking the XXPERMDI operation as using a physical register, which
would invalidate its presence in a swap-optimized region. This is
correct for the form of XXPERMDI that performs a swap and hence would
be removed, but is not what we want for a doubleword-splat variety of
XXPERMDI. Therefore we reset the physical-register flag on the
XXPERMDI when it represents a splat.
A simple test case is added to verify that we generate the splat and
that we also remove the xxswapd instructions that would otherwise be
associated with the load and store of another operand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241285 91177308-0d34-0410-b5e6-96231b3b80d8
When trying to upgrade @llvm.x86.sse2.psrl.dq while parsing a module,
BitcodeReader adds the function to its worklist twice, resulting in a
crash when accessing it the second time.
This patch replaces the worklist vector by a map.
Patch by Philip Pfaffe.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241281 91177308-0d34-0410-b5e6-96231b3b80d8
The code responsible for shl folding in the DAGCombiner was assuming incorrectly that all constants are less than 64 bits. This patch simply changes the way values are compared.
It has been reverted previously because of some problems with comparing APInt with raw uint64_t. That has been fixed/changed with r241204.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241254 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
r240039 adds a test case to check that CallGraph does the right thing
with respect to non-leaf intrinsics like statepoint and patchpoint.
This ports the same test case to LazyCallGraph. LazyCallGraph already
does the right thing with respect to escaping function pointers so there
is no need to change any code.
Reviewers: chandlerc
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10582
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241226 91177308-0d34-0410-b5e6-96231b3b80d8
This checks subtarget feature compatibility for inlining by verifying
that the callee is a strict subset of the caller's features. This includes
the cpu as part of the subtarget we can get via the incoming functions as
the backend takes CPUs as feature sets.
This allows us to inline things like:
int foo() { return baz(); }
int __attribute__((target("sse4.2"))) bar() {
return foo();
}
so that generic code can be inlined into specialized functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241221 91177308-0d34-0410-b5e6-96231b3b80d8
TwoAddressInstructionPass stops after a successful commuting but 3 Addr
conversion might be good for some cases.
Consider:
int foo(int a, int b) {
return a + b;
}
Before this commit, we emit:
addl %esi, %edi
movl %edi, %eax
ret
After this commit, we try 3 Addr conversion:
leal (%rsi,%rdi), %eax
ret
Patch by Volkan Keles <vkeles@apple.com>!
Differential Revision: http://reviews.llvm.org/D10851
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241206 91177308-0d34-0410-b5e6-96231b3b80d8
This is mostly an NFC, which increases code readability (instead of
saving old terminator, generating new one in front of old, and deleting
old, we just call a function). However, it would additionaly copy
the debug location from old instruction to replacement, which
would help PR23837.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241197 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
According to PTX ISA:
For convenience, ld, st, and cvt instructions permit source and destination data operands to be wider than the instruction-type size, so that narrow values may be loaded, stored, and converted using regular-width registers. For example, 8-bit or 16-bit values may be held directly in 32-bit or 64-bit registers when being loaded, stored, or converted to other types and sizes. The operand type checking rules are relaxed for bit-size and integer (signed and unsigned) instruction types; floating-point instruction types still require that the operand type-size matches exactly, unless the operand is of bit-size type.
So, the ISA does not support load with extending/store with truncatation for floating numbers. This is reflected in setting the loadext/truncstore actions to expand in the code for floating numbers, but vectors of floating numbers are not taken care of.
As a result, loading a vector of floats followed by a fp_extend may be combined by DAGCombiner to a extload, and the extload may be lowered to NVPTXISD::LoadV2 with extending information. However, NVPTXISD::LoadV2 does not perform extending, and no extending instructions are inserted. Finally, PTX instructions with mismatched types are generated, like
ld.v2.f32 {%fd3, %fd4}, [%rd2]
This patch adds the correct actions for vectors of floats, so DAGCombiner would not create loads with extending, and correct code is generated.
Patched by Gang Hu.
Test Plan: Test case attached.
Reviewers: jingyue
Reviewed By: jingyue
Subscribers: llvm-commits, jholewinski
Differential Revision: http://reviews.llvm.org/D10876
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241191 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Offset of frame index is calculated by NVPTXPrologEpilogPass. Before
that the correct offset of stack objects cannot be obtained, which
leads to wrong offset if there are more than 2 frame objects. This patch
move NVPTXPeephole after NVPTXPrologEpilogPass. Because the frame index
is already replaced by %VRFrame in NVPTXPrologEpilogPass, we check
VRFrame register instead, and try to remove the VRFrame if there
is no usage after NVPTXPeephole pass.
Patched by Xuetian Weng.
Test Plan:
Strengthened test/CodeGen/NVPTX/local-stack-frame.ll to check the
offset calculation based on SP and SPL.
Reviewers: jholewinski, jingyue
Reviewed By: jingyue
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D10853
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241185 91177308-0d34-0410-b5e6-96231b3b80d8
When adding little-endian vector support for PowerPC last year, I
inadvertently disabled an optimization that recognizes a load-splat
idiom and generates the lxvdsx instruction. This patch moves the
offending logic so lxvdsx is once again generated.
This pattern is frequently generated by the vectorizer for scalar
loads of an effective constant. Previously the lxvdsx instruction was
wrongly listed as lane-sensitive for the VSX swap optimization (since
both doublewords are identical, swaps are safe). This patch fixes
this as well, so that vectorized code using lxvdsx can now have swaps
removed from the computation.
There is an existing test (@test50) in test/CodeGen/PowerPC/vsx.ll
that checks for the missing optimization. However, vsx.ll was only
being tested for POWER7 with big-endian code generation. I've added
a little-endian RUN statement and expected LE code generation for all
the tests in vsx.ll to give us a bit better VSX coverage, including
what's needed for this patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241183 91177308-0d34-0410-b5e6-96231b3b80d8
This patch is not intended to change existing codegen behavior for any target.
It just exposes the JumpIsExpensive setting on the command-line to allow for
easier testing and emergency overrides.
Also, change the existing regression test to use FileCheck, explicitly specify
the jump-is-expensive option, and use more precise checks.
Differential Revision: http://reviews.llvm.org/D10846
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241179 91177308-0d34-0410-b5e6-96231b3b80d8
The EH code might have been deleted as unreachable and the personality
pruned while the filter is still present. Currently I'm hitting this at
-O0 due to the clang bug PR24009.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241170 91177308-0d34-0410-b5e6-96231b3b80d8
This patch teaches the AsmParser to accept add/adds/sub/subs/cmp/cmn
with a negative immediate operand and convert them as shown:
add Rd, Rn, -imm -> sub Rd, Rn, imm
sub Rd, Rn, -imm -> add Rd, Rn, imm
adds Rd, Rn, -imm -> subs Rd, Rn, imm
subs Rd, Rn, -imm -> adds Rd, Rn, imm
cmp Rn, -imm -> cmn Rn, imm
cmn Rn, -imm -> cmp Rn, imm
Those instructions are an alternate syntax available to assembly coders,
and are needed in order to support code already compiling with some other
assemblers (gas). They are documented in the "ARMv8 Instruction Set
Overview", in the "Arithmetic (immediate)" section. This makes llvm-mc
a programmer-friendly assembler !
This also fixes PR20978: "Assembly handling of adding negative numbers
not as smart as gas".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241166 91177308-0d34-0410-b5e6-96231b3b80d8
This also improves the logic of what is an error:
* getSection(uint_32): only return an error if the index is out of bounds. The
index 0 corresponds to a perfectly valid entry.
* getSection(Elf_Sym): Returns null for symbols that normally don't have
sections and error for out of bound indexes.
In many places this just moves the report_fatal_error up the stack, but those
can then be fixed in smaller patches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241156 91177308-0d34-0410-b5e6-96231b3b80d8
Function static variables, typedefs and records (class, struct or union) declared inside
a lexical scope were associated with the function as their parent scope, rather than the
lexical scope they are defined or declared in.
This fixes PR19238
Patch by: amjad.aboud@intel.com
Differential Revision: http://reviews.llvm.org/D9758
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241153 91177308-0d34-0410-b5e6-96231b3b80d8
Only consider an instruction a candidate for relaxation if the last operand of the
instruction is an expression. We previously checked whether any operand is an expression,
which is useless, since for all instructions concerned, the only operand that may be
affected by relaxation is the last one.
In addition, this removes the check for having RIP as an argument, since it was
plain wrong - even when one of the arguments is RIP, relaxation may still be needed.
This fixes PR9807.
Patch by: david.l.kreitzer@intel.com
Differential Revision: http://reviews.llvm.org/D10766
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241152 91177308-0d34-0410-b5e6-96231b3b80d8
The AArch32 assembler parses the '@' as a comment symbol, so the error message shouldn't suggest
that '@<type>' is a valid replacement when assembling for AArch32 target.
Differential Revision: http://reviews.llvm.org/D10651
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241149 91177308-0d34-0410-b5e6-96231b3b80d8
We would create a phi node with a zero initialized operand instead of
undef in the case where no value was originally available. This was
problematic for x86_mmx which has no null value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241143 91177308-0d34-0410-b5e6-96231b3b80d8
Surprisingly, this is a correctness issue: the mmx type exists for
calling convention purposes, LLVM doesn't have a zero representation for
them.
This partially fixes PR23999.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241142 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
nsw are flaky and can often be removed by optimizations. This patch enhances
nsw by leveraging @llvm.assume in the IR. Specifically, NaryReassociate now
understands that
assume(a + b >= 0) && assume(a >= 0) ==> a +nsw b
As a result, it can split more sext(a + b) into sext(a) + sext(b) for CSE.
Test Plan: nary-gep.ll
Reviewers: broune, meheff
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D10822
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241139 91177308-0d34-0410-b5e6-96231b3b80d8
The incoming EBP value established by the runtime is actually a pointer
to the end of the EH registration object, and not the true parent
function frame pointer. Clang doesn't need llvm.x86.seh.exceptioninfo
anymore because we know that the exception info pointer is at a fixed
offset from this incoming EBP.
The llvm.x86.seh.recoverfp intrinsic takes an EBP value provided by the
EH runtime and returns a pointer that is usable with llvm.framerecover.
The llvm.x86.seh.restoreframe intrinsic is inserted by the 32-bit
specific preparation pass in blocks targetted by the EH runtime. It
re-establishes any physical registers used by the parent function to
address the stack, such as the frame, base, and stack pointers.
Neither of these intrinsics correctly handle stack realignment prologues
yet, but it's possible to add that later.
Reviewers: majnemer
Differential Revision: http://reviews.llvm.org/D10848
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241125 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This change introduces a !make.implicit metadata that allows the
frontend to pre-select the set of explicit null checks that will be
considered for transformation into implicit null checks.
The reason for not using profiling data instead of !make.implicit is
explained in the change to `FaultMaps.rst`.
Reviewers: atrick, reames, pgavlin, JosephTremoulet
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D10824
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241116 91177308-0d34-0410-b5e6-96231b3b80d8
It is mandatory to specify a comdat in order to receive comdat semantics
for a symbol. We were previously getting this wrong in -function-sections
mode; linker-weak symbols were being emitted in a selectany comdat. This
change causes such symbols to use a noduplicates comdat instead, fixing
the inconsistency.
Also correct an inaccuracy in the docs.
Differential Revision: http://reviews.llvm.org/D10828
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241103 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Really check if %SP is not used in other places, instead of checking only exact
one non-dbg use.
Patched by Xuetian Weng.
Test Plan:
@foo4 in test/CodeGen/NVPTX/local-stack-frame.ll, create a case that
SP will appear twice.
Reviewers: jholewinski, jingyue
Reviewed By: jingyue
Subscribers: llvm-commits, sfantao, jholewinski
Differential Revision: http://reviews.llvm.org/D10844
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241099 91177308-0d34-0410-b5e6-96231b3b80d8
This commit implements serialization of the machine basic block successors. It
uses a YAML flow sequence that contains strings that have the MBB references.
The MBB references in those strings use the same syntax as the MBB machine
operands in the machine instruction strings.
Reviewers: Duncan P. N. Exon Smith
Differential Revision: http://reviews.llvm.org/D10699
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241093 91177308-0d34-0410-b5e6-96231b3b80d8
Duplicating an FP register "as itself" is a bad idea, since it violates the
invariant that every FP register is mapped to at most one FPU stack slot.
Use the scratch FP register instead.
This fixes PR23957.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241069 91177308-0d34-0410-b5e6-96231b3b80d8