This is similar to the getAlignment patch, but is done just for
completeness. It looks like we never call getSection on an alias. All the
tests still pass if the if is replaced with an assert.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208139 91177308-0d34-0410-b5e6-96231b3b80d8
remove it from the list of unspilled registers. Otherwise the following
attempt to keep the stack aligned by picking an extra GPR register to
spill will not work as it picks up r11.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208129 91177308-0d34-0410-b5e6-96231b3b80d8
This removes arguments passed everywhere and allows the use of
standard iteration over lists.
Should be no functional change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208127 91177308-0d34-0410-b5e6-96231b3b80d8
Split from the musttail inliner change. This will be covered by an opt
test when the inliner change lands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208126 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When I initially introduced -pass-remarks, I thought it would be a
neat idea to make it additive. So, if one used it as:
$ llc -pass-remarks=inliner --pass-remarks=loop.*
the compiler would build the regular expression '(inliner)|(loop.*)'.
The more I think about it, the more I regret it. This is not how
other flags work. The standard semantics are right-to-left overrides.
This is how clang interprets -Rpass. And I think the two should be
compatible in this respect.
Reviewers: qcolombet
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3614
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208122 91177308-0d34-0410-b5e6-96231b3b80d8
Before this patch, the backend always emitted a store+load sequence to
bitconvert from f64 to i64 the input operand of a ISD::BITCAST dag node that
performed a bitconvert from type MVT::f64 to type MVT::v2i32. The resulting
i64 node was then used to build a v2i32 vector.
With this patch, the backend now produces a cheaper SCALAR_TO_VECTOR from
MVT::f64 to MVT::v2f64. That SCALAR_TO_VECTOR is then followed by a "free"
bitcast to type MVT::v4i32. The elements of the resulting
v4i32 are then extracted to build a v2i32 vector (which is illegal and
therefore promoted to MVT::v2i64).
This is in general cheaper than emitting a stack store+load sequence
to bitconvert the operand from type f64 to type i64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208107 91177308-0d34-0410-b5e6-96231b3b80d8
This patch implements the infrastructure to use named register constructs in
programs that need access to specific registers (bare metal, kernels, etc).
So far, only the stack pointer is supported as a technology preview, but as it
is, the intrinsic can already support all non-allocatable registers from any
architecture.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208104 91177308-0d34-0410-b5e6-96231b3b80d8
An alias has the address of what it points to, so it also has the same
alignment.
This allows a few optimizations to see past aliases for free.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208103 91177308-0d34-0410-b5e6-96231b3b80d8
fall back to the normal path without a cpu. While doing this fix
llc to just exit when we don't have a module to process instead of
asserting.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208102 91177308-0d34-0410-b5e6-96231b3b80d8
The fact that GlobalAlias::setAlignment exists at all is a side effect of
how the classes are organized, it should never be used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208094 91177308-0d34-0410-b5e6-96231b3b80d8
Also, provide the ability to create temporary and non-temporary
declarations, as not all declarations may be replaced by definitions
later on.
This provides the necessary infrastructure for Clang to fix PR19598,
leaking temporary MDNodes in Clang's debug info generation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208054 91177308-0d34-0410-b5e6-96231b3b80d8
The Win64 docs are very clear that anything larger than 8 bytes is
passed by reference, and GCC MinGW64 honors that for __modti3 and
friends.
Patch by Jameson Nash!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208029 91177308-0d34-0410-b5e6-96231b3b80d8
On x64, windows.h doesn't include intrin.h for intrinsics. It just
declares them in the global namespace and uses them, expecting the
compiler to lower it as a builtin. We basically need to do this in
clang, eventually.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208023 91177308-0d34-0410-b5e6-96231b3b80d8
The number of tail call to loop conversions remains the same (1618 by my count).
The new algorithm does a local scan over the use-def chains to identify local "alloca-derived" values, as well as points where the alloca could escape. Then, a visit over the CFG marks blocks as being before or after the allocas have escaped, and annotates the calls accordingly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208017 91177308-0d34-0410-b5e6-96231b3b80d8
Visibility is meaningless when the linkage is local. Change
`-internalize` to reset the visibility to `default`.
<rdar://problem/16141113>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207979 91177308-0d34-0410-b5e6-96231b3b80d8
Tested that the right -target-cpu is set in the clang -cc1 command line
when running "clang -march=native -E -v - </dev/null" on both an FX-8150
and an FX-8350. Both are family 15h; the FX-8150 (Bulldozer processor)
reports a model number of 1, and the FX-8350 (Piledriver processor)
reports a model number of 2.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207973 91177308-0d34-0410-b5e6-96231b3b80d8
Windows on ARM does not conform to AEABI. However, memset would be emitted
using the AEABI signature, resulting in inverted parameters. Handle this
special case appropriately.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207943 91177308-0d34-0410-b5e6-96231b3b80d8
Add handling for FK_SecRel_4 (4-byte section relative relocations). These are
used by the generation of DWARF debug information (the abbrevations use section
relative relocations). This will also be used in generation of CodeView line
tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207941 91177308-0d34-0410-b5e6-96231b3b80d8
Otherwise we use the same threshold as for complete unrolling, which is
way too high. This made us unroll any loop smaller than 150 instructions
by 8 times, but only if someone specified -march=core2 or better,
which happens to be the default on darwin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207940 91177308-0d34-0410-b5e6-96231b3b80d8
When can't assume a vectorized tree is rooted in an instruction. The IRBuilder
could have constant folded it. When we rebuild the build_vector (the series of
InsertElement instructions) use the last original InsertElement instruction. The
vectorized tree root is guaranteed to be before it.
Also, we can't assume that the n-th InsertElement inserts the n-th element into
a vector.
This reverts r207746 which reverted the revert of the revert of r205018 or so.
Fixes the test case in PR19621.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207939 91177308-0d34-0410-b5e6-96231b3b80d8
operations on the call graph. This one forms a cycle, and while not as
complex as removing an internal edge from an SCC, it involves
a reasonable amount of work to find all of the nodes newly connected in
a cycle.
Also somewhat alarming is the worst case complexity here: it might have
to walk roughly the entire SCC inverse DAG to insert a single edge. This
is carefully documented in the API (I hope).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207935 91177308-0d34-0410-b5e6-96231b3b80d8
Both MinGW and cygwin (i686) construct export directives without the global
leader prefix. This is mostly due to the fact that they use GNU ld which does
not correctly handle the export directive. This apparently has been been broken
for a while. However, this was recently reported as being broken by
mingwandroid and diorcety of the msys2 project.
Remove the global leader prefix if targeting MinGW or cygwin, otherwise, retain
the global leader prefix. Add an explicit test for cygwin's behaviour of export
directives.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207926 91177308-0d34-0410-b5e6-96231b3b80d8
Create a helper function to generate the export directive. This was previously
duplicated inline to handle export directives for variables and functions. This
also enables the use of range-based iterators for the generation of the
directive rather than the traditional loops. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207925 91177308-0d34-0410-b5e6-96231b3b80d8
This fix simply ensures that both metadata nodes are path-aware before
performing path-aware alias analysis.
This issue isn't normally triggered in LLVM, because we perform an autoupgrade
of the TBAA metadata to the new format when reading in LL or BC files. This
issue only appears when a client creates the IR manually and mixes old and new
TBAA metadata format.
This fixes <rdar://problem/16760860>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207923 91177308-0d34-0410-b5e6-96231b3b80d8
The fix itself is fairly simple: move getAccessVariant to MCValue so that we
replace the old weak expression evaluation with the far more general
EvaluateAsRelocatable.
This then requires that EvaluateAsRelocatable stop when it finds a non
trivial reference kind. And that in turn requires the ELF writer to look
harder for weak references.
Last but not least, this found a case where we were being bug by bug
compatible with gas and accepting an invalid input. I reported pr19647
to track it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207920 91177308-0d34-0410-b5e6-96231b3b80d8
There is no point in creating it if we're not going to vectorize
anything. Creating the map is expensive as it creates large values.
No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207916 91177308-0d34-0410-b5e6-96231b3b80d8
which are corresponding to the current target read from the ELF file.
This fix cannot be tested until obj2yaml does not support ELF format.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207905 91177308-0d34-0410-b5e6-96231b3b80d8
Committed initially in r207724-r207726 and reverted due to compiler-rt
crashes in r207732.
Instead, fix this harder with unordered_map and store the LexicalScopes
by value in the map. This did necessitate moving the definition of
LexicalScope above the definition of LexicalScopes.
Let's see how the buildbots/compilers tolerate unordered_map::emplace +
std::piecewise_construct + std::forward_as_tuple...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207876 91177308-0d34-0410-b5e6-96231b3b80d8
There are public functions that mutate various members as well as
another private member already, so make all the members private to
avoid the discontinuity and add accessors for the values. Should
be no functional change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207868 91177308-0d34-0410-b5e6-96231b3b80d8
Reading line tables in llvm-cov was pretty broken, but would happen to
work as long as no line in the table was 0. It's not clear to me
whether a line of zero *should* show up in these tables, but deciding
to read a string in the middle of the line table is certainly the
wrong thing to do if it does.
I've also added some comments, as trying to figure out what this block
of code was doing was fairly unpleasant.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207866 91177308-0d34-0410-b5e6-96231b3b80d8
This moves most of GlobalOpt's constructor optimization
code out of GlobalOpt into Transforms/Utils/CDtorUtils.{h,cpp}. The
public interface is a single function OptimizeGlobalCtorsList() that
takes a predicate returning which constructors to remove.
GlobalOpt calls this with a function that statically evaluates all
constructors, just like it did before. This part of the change is
behavior-preserving.
Also add a call to this from GlobalDCE with a filter that removes global
constructors that contain a "ret" instruction and nothing else – this
fixes PR19590.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207856 91177308-0d34-0410-b5e6-96231b3b80d8
address to AnalyzeLoadFromClobberingLoad. This fixes a bug in load-PRE where
PRE is applied to a load that is not partially redundant.
<rdar://problem/16638765>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207853 91177308-0d34-0410-b5e6-96231b3b80d8
.file records are supposed to have a section identifier of 65534
(IMAGE_SCN_DEBUG) rather than 0. This is spelt out clearly within the PE/COFF
specification. Fix this minor oversight with the implementation for support for
.file records.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207851 91177308-0d34-0410-b5e6-96231b3b80d8
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Signed-off-by: Samuel Li <samuel.li@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207846 91177308-0d34-0410-b5e6-96231b3b80d8
v2: move code to AMDGPUISelLowering.cpp
squash with tests (both EG and SI)
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207845 91177308-0d34-0410-b5e6-96231b3b80d8
The register spiller assumes that only one new instruction is created
when spilling and restoring registers, so we need to emit pseudo
instructions for vector register spills and lower them after
register allocation.
v2:
- Fix calculation of lane index
- Extend VGPR liveness to end of program.
v3:
- Use SIMM16 field of S_NOP to specify multiple NOPs.
https://bugs.freedesktop.org/show_bug.cgi?id=75005
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207843 91177308-0d34-0410-b5e6-96231b3b80d8
Previously, LLVM had no knowledge that these instructions actually
modified their address register: fine if they never end up in CodeGen,
but when I'd rather like to write some patterns for them it becomes a
disaster.
The change is mostly straightforward, I think the most significant
design decision was to *always* put the address write-back first. This
allows loads and stores to be accessed more uniformly, for example
permitting the continued sharing of the InstAlias definitions.
I also discovered that the custom Decode logic is no longer needed, so
I removed it.
No tests, because there should be no functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207839 91177308-0d34-0410-b5e6-96231b3b80d8
While post-indexed LD1/ST1 instructions do exist for vector loads,
this patch makes use of the more flexible addressing-modes in LDR/STR
instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207838 91177308-0d34-0410-b5e6-96231b3b80d8
Given the following C code llvm currently generates suboptimal code for
x86-64:
__m128 bss4( const __m128 *ptr, size_t i, size_t j )
{
float f = ptr[i][j];
return (__m128) { f, f, f, f };
}
=================================================
define <4 x float> @_Z4bss4PKDv4_fmm(<4 x float>* nocapture readonly %ptr, i64 %i, i64 %j) #0 {
%a1 = getelementptr inbounds <4 x float>* %ptr, i64 %i
%a2 = load <4 x float>* %a1, align 16, !tbaa !1
%a3 = trunc i64 %j to i32
%a4 = extractelement <4 x float> %a2, i32 %a3
%a5 = insertelement <4 x float> undef, float %a4, i32 0
%a6 = insertelement <4 x float> %a5, float %a4, i32 1
%a7 = insertelement <4 x float> %a6, float %a4, i32 2
%a8 = insertelement <4 x float> %a7, float %a4, i32 3
ret <4 x float> %a8
}
=================================================
shlq $4, %rsi
addq %rdi, %rsi
movslq %edx, %rax
vbroadcastss (%rsi,%rax,4), %xmm0
retq
=================================================
The movslq is uneeded, but is present because of the trunc to i32 and then
sext back to i64 that the backend adds for vbroadcastss.
We can't remove it because it changes the meaning. The IR that clang
generates is already suboptimal. What clang really should emit is:
%a4 = extractelement <4 x float> %a2, i64 %j
This patch makes that legal. A separate patch will teach clang to do it.
Differential Revision: http://reviews.llvm.org/D3519
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207801 91177308-0d34-0410-b5e6-96231b3b80d8
This creates a lot of core infrastructure in which to add, with little
effort, quite a bit more to mips fast-isel
Test Plan: simplestore.ll
Reviewers: dsanders
Reviewed By: dsanders
Differential Revision: http://reviews.llvm.org/D3527
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207790 91177308-0d34-0410-b5e6-96231b3b80d8
This was initialized by llvm-mc (calling setDwarfVersion) but other
clients (such as clang, llc, etc) aren't necessarily initializing this
so we were getting garbage DWARF version values in the output.
Initialize it to a reasonable default (the same default used in llvm-mc,
though this is higher than it was (2) previously).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207788 91177308-0d34-0410-b5e6-96231b3b80d8
This matches gas' behaviour on COFF.
I think that this yak is now sufficiently shaved for aliases with offset
to work.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207786 91177308-0d34-0410-b5e6-96231b3b80d8
This optimization merges the common part of a group of GEPs, so we can compute
each pointer address by adding a simple offset to the common part.
The optimization is currently only enabled for the NVPTX backend, where it has
a large payoff on some benchmarks.
Review: http://reviews.llvm.org/D3462
Patch by Jingyue Wu.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207783 91177308-0d34-0410-b5e6-96231b3b80d8
This the LLVM portion that will allow Clang and other frontends to emit
typedefs of void by providing a null type for the typedef's underlying
type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207777 91177308-0d34-0410-b5e6-96231b3b80d8
Use i32 instead of specifying SReg_32. When this is
the pseudo INDIRECT_BASE_ADDR, this would give a bogus
verifier error.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207770 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes pr19147.
There are a few more related issues to fix, but the testcase in the bug now
passes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207763 91177308-0d34-0410-b5e6-96231b3b80d8
We currently force symbols to be globals in .thumb_set. The intent
seems to be that given
.thumb_set foo, bar
we emit an undefined symbol to bar if it is never defined. The side
effect is that we mark bar as global, even if it is defined, which gas
does not.
Producing an undefined reference to bar is a general difference from MC and gas.
For example, given
a = b
gas will produce an undefined reference to b, MC will not. I would be surprised
if any code depends on this, but it it does, we should fix the general
difference, not special case .thumb_set.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207757 91177308-0d34-0410-b5e6-96231b3b80d8
The canonical form of the BFM instruction is always one of the more explicit
extract or insert operations, which makes reading output much easier.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207752 91177308-0d34-0410-b5e6-96231b3b80d8
just connects an SCC to one of its descendants directly. Not much of an
impact. The last one is the hard one -- connecting an SCC to one of its
ancestors, and thereby forming a cycle such that we have to merge all
the SCCs participating in the cycle.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207751 91177308-0d34-0410-b5e6-96231b3b80d8
of SCCs in the SCC DAG. Exercise them in the big graph test case. These
will be especially useful for establishing invariants in insertion
logic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207749 91177308-0d34-0410-b5e6-96231b3b80d8
=[
Turns out that this was the root cause of PR19621. We found a crasher
only recently (likely due to improvements elsewhere in the SLP
vectorizer) but the reduced test case failed all the way back to here.
I've confirmed that reverting this patch both fixes the reduced test
case in PR19621 and the actual source file that led to it, so it seems
to really be rooted here. I've replied to the commit thread with
discussion of my (feeble) attempts to debug this. Didn't make it very
far, so reverting now that we have a good test case so that things can
get back to healthy while the debugging carries on.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207746 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
There are two functional changes:
1) The directive is not expanded for the ASM->ASM code path.
2) If PIC is not set, there's no expansion for the ASM->OBJ code path (same behaviour as GAS).
Reviewers: dsanders
Reviewed By: dsanders
Differential Revision: http://reviews.llvm.org/D3482
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207741 91177308-0d34-0410-b5e6-96231b3b80d8
GAS doesn't actually accept these particular cases.
The mnemonic without the trailing 'v' still supports two-operand aliases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207740 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes the memory leak introduced with the initial addition of support for
WoA stack probing. Now that the pseudo-instruction expansion can handle an
external symbol, use that to generate the load which simplifies the logic as
well as avoids the memory leak.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207737 91177308-0d34-0410-b5e6-96231b3b80d8
This enhances the expansion of the mov32imm pseudo-instruction to support an
external symbol reference. This is motivated by a simplification of the stack
probe emission for Windows on ARM (and fixing a leak).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207736 91177308-0d34-0410-b5e6-96231b3b80d8
This makes the coff writer compute the correct symbol value for the test in
pr19147. The section is still incorrect, that will be fixed in a followup patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207728 91177308-0d34-0410-b5e6-96231b3b80d8
Breaks GDB buildbot
(http://lab.llvm.org:8011/builders/clang-x86_64-ubuntu-gdb-75/builds/14517)
GCC emits DW_AT_object_pointer /everywhere/ (declaration, abstract
definition, inlined subroutine), but it looks like GCC relies on it
being somewhere other than the declaration, at least. I'll experiment
further & can hopefully still remove it from the inlined_subroutine.
This reverts commit r207705.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207719 91177308-0d34-0410-b5e6-96231b3b80d8
They just don't need to be there - they're inherited from the abstract
definition. In theory I would like them to be inherited from the
declaration, but the DWARF standard doesn't quite say that... we can
probably do it anyway but I'm less confident about that so I'll leave it
for a separate commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207717 91177308-0d34-0410-b5e6-96231b3b80d8
Not all address taken blocks get inlined. The reason is
that a blocks new address is known only when it is cloned. But e.g.
a branch instruction in a different block could need that address earlier
while it gets cloned. The solution is to collect the set of all
blocks that can potentially get inlined and compute a new block address
up front. Then clone and cleanup.
rdar://16427209
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207713 91177308-0d34-0410-b5e6-96231b3b80d8
This simplifies ELFObjectWriter::SymbolValue a bit more. This new version
will also be used in the COFF writer to fix pr19147.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207711 91177308-0d34-0410-b5e6-96231b3b80d8
This effectively reverts r164326, but adds some comments and
justification and ensures we /don't/ emit the DW_AT_object_pointer on
the (abstract and concrete) definitions. (while still preserving it on
standalone definitions involving ObjC Blocks)
This does increase the size of member function declarations from 7 to 11
bytes, unfortunately, but still seems like the Right Thing to do so that
callers that see only the declaration still have the information about
the object pointer. That said, I don't know what, if any, DWARF
consumers don't have a heuristic to guess this in the case of normal
C++ member functions - perhaps we can remove it entirely.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207705 91177308-0d34-0410-b5e6-96231b3b80d8
For pattern like ((x >> C1) & Mask) << C2, DAG combiner may convert it
into (x >> (C1-C2)) & (Mask << C2), which makes pattern matching of ubfx
more difficult.
For example:
Given
%shr = lshr i64 %x, 4
%and = and i64 %shr, 15
%arrayidx = getelementptr inbounds [8 x [64 x i64]]* @arr, i64 0, %i64 2, i64 %and
%0 = load i64* %arrayidx
With current shift folding, it takes 3 instrs to compute base address:
lsr x8, x0, #1
and x8, x8, #0x78
add x8, x9, x8
If using ubfx, it only needs 2 instrs:
ubfx x8, x0, #4, #4
add x8, x9, x8, lsl #3
This fixes bug 19589
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207702 91177308-0d34-0410-b5e6-96231b3b80d8
DwarfDebug.h has a SmallVector member containing a unique_ptr of an
incomplete type. MSVC doesn't have key functions, so the vtable and
dtor are emitted in AsmPrinter.cpp, where DwarfDebug's ctor is called.
AsmPrinter.cpp include DwarfUnit.h and doesn't get a complete definition
of DwarfTypeUnit. We could fix the problem by including DwarfUnit.h in
DwarfDebug.h, but that would increase header bloat. Instead, define
~DwarfDebug out of line.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207701 91177308-0d34-0410-b5e6-96231b3b80d8
There is no need to check if we want to hoist the immediate value of an
shift instruction. Simply return TCC_Free right away.
This change is like r206101, but for X86.
rdar://problem/16190769
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207692 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The pattern sltu $r1, $r2, $imm is found in handwritten assembly which
is just a shorthand version of sltui $r1, $r2, $imm.
Reviewers: dsanders
Reviewed By: dsanders
Differential Revision: http://reviews.llvm.org/D3508
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207671 91177308-0d34-0410-b5e6-96231b3b80d8
We already do this for shstrtab, so might as well do it for strtab. This
extracts the string table building code into a separate class. The idea
is to use it for other object formats too.
I mostly wanted to do this for the general principle, but it does save a
little bit on object file size. I tried this on a clang bootstrap and
saved 0.54% on the sum of object file sizes (1.14 MB out of 212 MB for
a release build).
Differential Revision: http://reviews.llvm.org/D3533
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207670 91177308-0d34-0410-b5e6-96231b3b80d8
It's been decided that in the future, the floating-point immediate in
instructions like "fcmeq v0.2s, v1.2s, #0.0" will be canonically "0.0", which
has been implemented on AArch64 already but not ARM64.
This fixes that issue.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207666 91177308-0d34-0410-b5e6-96231b3b80d8
Pretty straightforward, we weren't propagating whether or not an
AllocaInst had 'inalloca' marked on it when it came time to clone it.
The inliner exposed this bug. A reduced testcase is forthcoming.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207665 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The pattern dsll/dsrl $rd, $rt, $rs is found in handwritten assembly which
is just a shorthand version of dsllv/dsrlv $rd, $rt, $rs.
Reviewers: dsanders
Reviewed By: dsanders
Differential Revision: http://reviews.llvm.org/D3486
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207664 91177308-0d34-0410-b5e6-96231b3b80d8
We can't use SALU instructions for this since they ignore the EXEC mask
and are always executed.
This fixes several OpenCV tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207661 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The pattern sll/srl $rd, $rt, $rs is found in handwritten assembly which
is just a shorthand version of sllv/srlv $rd, $rt, $rs.
Reviewers: dsanders
Reviewed By: dsanders
Differential Revision: http://reviews.llvm.org/D3483
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207657 91177308-0d34-0410-b5e6-96231b3b80d8
target cannot be determined accurately. This is the case for NaCl where the
sandboxing instructions are added in MC layer, after the MipsLongBranch pass.
It is also the case when the code has inline assembly. Instead of calculating
offset in the MipsLongBranch pass, use %hi(sym1 - sym2) and %lo(sym1 - sym2)
expressions that are resolved during the fixup.
This patch also deletes microMIPS test file test/CodeGen/Mips/micromips-long-branch.ll
and implements microMIPS CHECKs in a much simpler way in a file
test/CodeGen/Mips/longbranch.ll, together with MIPS32 and MIPS64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207656 91177308-0d34-0410-b5e6-96231b3b80d8
Only emit calls to compiler-rt asm routines on platforms where they are
present (currently limited to linux i386/x86_64).
Patch by Yuri Gorshenin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207651 91177308-0d34-0410-b5e6-96231b3b80d8
The canonical syntax for shifts by a variable amount does not end with 'v', but
that syntax should be supported as an alias (presumably for legacy reasons).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207649 91177308-0d34-0410-b5e6-96231b3b80d8
AArch64 does not have a CPSR register in the same way that AArch32 does. Most
of its compiler-relevant roles have been taken over by the more specific NZCV
register (representing just the flags set by normal instructions).
Its system control functions still remain, but are now under the
pseudo-register referred to as "PSTATE". They're accessed via various MRS & MSR
instructions described in the reference manual.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207645 91177308-0d34-0410-b5e6-96231b3b80d8
On instructions using the NZCV register, a couple of conditions have dual
representations: HS/CS and LO/CC (meaning unsigned-higher-or-same/carry-set and
unsigned-lower/carry-clear). The first of these is more descriptive in most
circumstances, so we should print it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207644 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This isn't supported directly so we rotate the vector by the desired number of
elements, insert to element zero, then rotate back.
The i64 case generates rather poor code on MIPS32. There is an obvious
optimisation to be made in future (do both insert.w's inside a shared
rotate/unrotate sequence) but for now it's sufficient to select valid code
instead of aborting.
Depends on D3536
Reviewers: matheusalmeida
Reviewed By: matheusalmeida
Differential Revision: http://reviews.llvm.org/D3537
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207640 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This directive is used for setting up $gp in the beginning of a function.
It expands to three instructions if PIC is enabled:
lui $gp, %hi(_gp_disp)
addui $gp, $gp, %lo(_gp_disp)
addu $gp, $gp, $reg
_gp_disp is a special symbol that the linker sets to the distance between
the lui instruction and the context pointer (_gp).
Reviewers: dsanders
Reviewed By: dsanders
Differential Revision: http://reviews.llvm.org/D3480
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207637 91177308-0d34-0410-b5e6-96231b3b80d8
Since these are mostly used in "lsl #16", "lsl #32", "lsl #48" combinations to
piece together an immediate in 16-bit chunks, hex is probably the most
appropriate format.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207635 91177308-0d34-0410-b5e6-96231b3b80d8
This is mostly aimed at the NEON logical operations and MOVI/MVNI (since they
accept weird shifts which are more naturally understandable in hex notation).
Also changes BRK/HINT etc, which is probably a neutral change, but easier than
the alternative.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207634 91177308-0d34-0410-b5e6-96231b3b80d8
Since these instructions only accept a 12-bit immediate, possibly shifted left
by 12, the canonical syntax used by the architecture reference manual is "#N {,
lsl #12 }". We should accept an immediate that has already been shifted, (e.g.
Also, print a comment giving the full addend since it can be helpful.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207633 91177308-0d34-0410-b5e6-96231b3b80d8
edge entirely within an existing SCC. Shockingly, making the connected
component more connected is ... a total snooze fest. =]
Anyways, its wired up, and I even added a test case to make sure it
pretty much sorta works. =D
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207631 91177308-0d34-0410-b5e6-96231b3b80d8
A bunch of switch cases were missing, not just for ARM64 but also for
AArch64_BE. I've fixed all those, but there's zero testing as
ExecutionEngine tests are disabled when crosscompiling and I don't
have a native platform available to test on.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207626 91177308-0d34-0410-b5e6-96231b3b80d8
This is a partial port of r204816 (cpirker "Elf support for MC-JIT
runtime dynamic linker") from AArch64 to ARM64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207625 91177308-0d34-0410-b5e6-96231b3b80d8
bits), and discover that it's totally broken. Yay tests. Boo bug. Fix
the basic edge removal so that it works by nulling out the removed edges
rather than actually removing them. This leaves the indices valid in the
map from callee to index, and preserves some of the locality for
iterating over edges. The iterator is made bidirectional to reflect that
it now has to skip over null entries, and the skipping logic is layered
onto it.
As future work, I would like to track essentially the "load factor" of
the edge list, and when it falls below a threshold do a compaction.
An alternative I considered (and continue to consider) is storing the
callees in a doubly linked list where each element of the list is in
a set (which is essentially the classical linked-hash-table
datastructure). The problem with that approach is that either you need
to heap allocate the linked list nodes and use pointers to them, or use
a bucket hash table (with even *more* linked list pointer overhead!),
etc. It's pretty easy to get 5x overhead for values that are just
pointers. So far, I think punching holes in the vector, and periodic
compaction is likely to be much more efficient overall in the space/time
tradeoff.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207619 91177308-0d34-0410-b5e6-96231b3b80d8
This introduces the stack lowering emission of the stack probe function for
Windows on ARM. The stack on Windows on ARM is a dynamically paged stack where
any page allocation which crosses a page boundary of the following guard page
will cause a page fault. This page fault must be handled by the kernel to
ensure that the page is faulted in. If this does not occur and a write access
any memory beyond that, the page fault will go unserviced, resulting in an
abnormal program termination.
The watermark for the stack probe appears to be at 4080 bytes (for
accommodating the stack guard canaries and stack alignment) when SSP is
enabled. Otherwise, the stack probe is emitted on the page size boundary of
4096 bytes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207615 91177308-0d34-0410-b5e6-96231b3b80d8
Emit the COFF header when printing out the function. This is important as the
header contains two important pieces of information: the storage class for the
symbol and the symbol type information. This bit of information is required for
the linker to correctly identify the type of symbol that it is dealing with.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207613 91177308-0d34-0410-b5e6-96231b3b80d8
When building with -Werror=covered-switch-default (as on the buildbots), the
build would fail since all cases are covered by the switch. Move the
llvm_unreachable to the end of the function as an annotation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207609 91177308-0d34-0410-b5e6-96231b3b80d8
IMAGE_REL_ARM_MOV32T relocations require that the movw/movt pair-wise
relocation is not split up and reordered. When expanding the mov32imm
pseudo-instruction, create a bundle if the machine operand is referencing an
address. This helps ensure that the relocatable address load is not reordered
by subsequent passes.
Unfortunately, this only partially handles the case as the Constant Island Pass
occurs after the instructions are unbundled and does not properly handle
bundles. That is a more fundamental issue with the pass itself and beyond the
scope of this change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207608 91177308-0d34-0410-b5e6-96231b3b80d8
Streamline parsing and dumping line tables:
Prefer composition to multiple inheritance in DWARFDebugLine::ParsingState.
Get rid of the weird concept of "DumpingState" structure.
was:
DWARFDebugLine::DumpingState state(OS);
DWARFDebugLine::parseStatementTable(..., state);
now:
DWARFDebugLine::LineTable LineTable;
LineTable.parse(...);
LineTable.dump(OS);
No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207599 91177308-0d34-0410-b5e6-96231b3b80d8
Currently, musttail codegen is relying on sibcall optimization, and
reporting a fatal error if fails. Sibcall optimization fails when stack
arguments need to be modified, which is insufficient for musttail.
The logic for moving arguments in memory safely is already implemented
for GuaranteedTailCallOpt. This change merely arranges for musttail
calls to use it.
No functional change for GuaranteedTailCallOpt.
Reviewers: espindola
Differential Revision: http://reviews.llvm.org/D3493
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207598 91177308-0d34-0410-b5e6-96231b3b80d8
MSVC 2013 provides std::make_unique, which it finds with ADL when one of
the parameters is std::unique_ptr, leading to an ambiguous overload.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207597 91177308-0d34-0410-b5e6-96231b3b80d8
It's already set in AMDGPUISelLowering for all GPUs
Patch By: Jan Vesely
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207592 91177308-0d34-0410-b5e6-96231b3b80d8
SI_IF and SI_ELSE are terminators which also produce a value. For
these instructions ISel always inserts a COPY to move their value
to another basic block. This COPY ends up between SI_(IF|ELSE)
and the S_BRANCH* instruction at the end of the block.
This breaks MachineBasicBlock::getFirstTerminator() and also the
machine verifier which assumes that terminators are grouped together at
the end of blocks.
To solve this we coalesce the copy away right after ISel to make sure
there are no instructions in between terminators at the end of blocks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207591 91177308-0d34-0410-b5e6-96231b3b80d8
SALU instructions ignore control flow, so it is not always safe to use
them within branches. This is a partial solution to this problem
until we can come up with something better.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207590 91177308-0d34-0410-b5e6-96231b3b80d8
This is a squash of several optimization commits:
- calculate DIV_Lo and DIV_Hi separately
- use BFE_U32 if we are operating on 32bit values
- use precomputed constants instead of shifting in UDVIREM
- skip the first 32 iterations of udivrem
v2: Check whether BFE is supported before using it
Patch by: Jan Vesely
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207589 91177308-0d34-0410-b5e6-96231b3b80d8
Initial implementation, rather slow
Patch by: Jan Vesely
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207588 91177308-0d34-0410-b5e6-96231b3b80d8
When legalizing ops, with UDIV/UREM set to expand, they automatically
expand to UDIVREM (if legal or custom).
We need to do this manually for legalize types.
v2:
SI should be set to Expand because the type is legal, and it is
automatically lowered to UDIVREM if UDIVREM is Legal/Custom
R600 should set to UDIV/UREM to Custom because it needs to lower them
during type legalization
Patch by: Jan Vesely
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207587 91177308-0d34-0410-b5e6-96231b3b80d8
Patch by: Jan Vesely
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207586 91177308-0d34-0410-b5e6-96231b3b80d8
Seems MSVC wants to be able to codegen inline-definitions of virtual
functions even in TUs that don't define the key function - and it's well
within its rights to do so.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207581 91177308-0d34-0410-b5e6-96231b3b80d8
This starts in MCJIT::getSymbolAddress where the
unique_ptr<object::Binary> is release()d and (after a cast) passed to a
single caller, MCJIT::addObjectFile.
addObjectFile calls RuntimeDyld::loadObject.
RuntimeDld::loadObject calls RuntimeDyldELF::createObjectFromFile
And the pointer is never owned at this point. I say this point, because
the alternative codepath, RuntimeDyldMachO::createObjectFile certainly
does take ownership, so this seemed like a good hint that this was a/the
right place to take ownership.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207580 91177308-0d34-0410-b5e6-96231b3b80d8
Move several function definitions into .cpp, unify constructors
and clear() methods (fixing a couple of latent bugs from copy-paste),
turn static function parsePrologue() into Prologue::parse().
More work needed here to untangle weird multiple inheritance
in table parsing and dumping.
No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207579 91177308-0d34-0410-b5e6-96231b3b80d8
The instcomine logic to handle vpermilvar's pd and 256 variants was incorrect.
The _256 variants have indexes into the individual 128 bit lanes and in all
cases it also has to mask out unused bits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207577 91177308-0d34-0410-b5e6-96231b3b80d8
This patch changes the vectorization remarks to also inform when
vectorization is possible but not beneficial.
Added tests to exercise some loop remarks.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207574 91177308-0d34-0410-b5e6-96231b3b80d8
DIE doesn't need to store a pointer to its parent: we can traverse the DIE tree
only with functions getFirstChild() and getSibling(). Parents must be known
only when we construct the tree. Rewrite setDIERelations() procedure in a more
straightforward way, and get rid of lots of now unused DIEMinimal methods.
No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207563 91177308-0d34-0410-b5e6-96231b3b80d8
Change `BlockFrequency` to defer to `BranchProbability::scale()` and
`BranchProbability::scaleByInverse()`.
This removes `BlockFrequency::scale()` from its API (and drops the
ability to see the remainder), but the only user was the unit tests. If
some code in the future needs an API that exposes the remainder, we can
add something to `BranchProbability`, but I find that unlikely.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207550 91177308-0d34-0410-b5e6-96231b3b80d8
Add API to `BranchProbability` for scaling big integers. Next job is to
rip the logic out of `BlockMass` and `BlockFrequency`.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207544 91177308-0d34-0410-b5e6-96231b3b80d8
These were called from distinct places and had significant distinct
behavior. No need to make that a dynamic check inside the function
rather than just having two functions (refactoring some common code into
a helper function to be called from the two separate functions).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207539 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This calls emitOptimizationRemark from the loop unroller and vectorizer
at the point where they make a positive transformation. For the
vectorizer, it reports vectorization and interleave factors. For the
loop unroller, it reports all the different supported types of
unrolling.
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3456
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207528 91177308-0d34-0410-b5e6-96231b3b80d8
This patch centralizes the handling of the thumb bit around
MCStreamer::isThumbFunc and makes isThumbFunc handle aliases.
This fixes a corner case, but the main advantage is having just one
way to check if a MCSymbol is thumb or not. This should still be
refactored to be ARM only, but at least now it is just one predicate
that has to be refactored instead of 3 (isThumbFunc,
ELF_Other_ThumbFunc, and SF_ThumbFunc).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207522 91177308-0d34-0410-b5e6-96231b3b80d8
There are no patterns for this. This was already fixed for ARM64 but I forgot
to apply it to AArch64 too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207515 91177308-0d34-0410-b5e6-96231b3b80d8