Handle chains in which the same offset is used for both loads and
stores to the same array.
Fixes rdar://11410078.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174789 91177308-0d34-0410-b5e6-96231b3b80d8
same so we put in the comment field an indicator when we think we are
emitting the 16 bit version. For the direct object emitter, the difference is
important as well as for other passes which need an accurate count of
program size. There will be other similar putbacks to this for various
instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174747 91177308-0d34-0410-b5e6-96231b3b80d8
Previously, even when a pre-increment load or store was generated,
we often needed to keep a copy of the original base register for use
with other offsets. If all of these offsets are constants (including
the offset which was combined into the addressing mode), then this is
clearly unnecessary. This change adjusts these other offsets to use the
new incremented address.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174746 91177308-0d34-0410-b5e6-96231b3b80d8
Aside from the question of whether we report a warning or an error when we
can't satisfy a requested stack object alignment, the current implementation
of this is not good. We're not providing any source location in the diagnostics
and the current warning is not connected to any warning group so you can't
control it. We could improve the source location somewhat, but we can do a
much better job if this check is implemented in the front-end, so let's do that
instead. <rdar://problem/13127907>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174741 91177308-0d34-0410-b5e6-96231b3b80d8
Thanks to help from Nadav and Hal, I have a more reasonable (and even
correct!) approach. This specifically penalizes the insertelement
and extractelement operations for the performance hit that will occur
on PowerPC processors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174725 91177308-0d34-0410-b5e6-96231b3b80d8
isn't using the default calling convention. However, if the transformation is
from a call to inline IR, then the calling convention doesn't matter.
rdar://13157990
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174724 91177308-0d34-0410-b5e6-96231b3b80d8
Adds a function to target transform info to query for the cost of address
computation. The cost model analysis pass now also queries this interface.
The code in LoopVectorize adds the cost of address computation as part of the
memory instruction cost calculation. Only there, we know whether the instruction
will be scalarized or not.
Increase the penality for inserting in to D registers on swift. This becomes
necessary because we now always assume that address computation has a cost and
three is a closer value to the architecture.
radar://13097204
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174713 91177308-0d34-0410-b5e6-96231b3b80d8
allowed size for the instruction. This code uses RegScavenger to fix this.
We sometimes need 2 registers for Mips16 so we must handle things
differently than how register scavenger is normally used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174696 91177308-0d34-0410-b5e6-96231b3b80d8
These instructions compare two floating point values and return an
integer true (-1) or false (0) value.
When compiling code generated by the Mesa GLSL frontend, the SET*_DX10
instructions save us four instructions for most branch decisions that
use floating-point comparisons.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174609 91177308-0d34-0410-b5e6-96231b3b80d8
The test is a binary placed in test/DebugInfo/Inputs, with a source C
file used for reference/reproducing. The source's first line is a clang
build command for reproducing the binary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174543 91177308-0d34-0410-b5e6-96231b3b80d8
account. Atoms use LEA for updating SP in prologs/epilogs, and the
exact LEA opcode depends on the data model.
Also reapplying the test case which was added and then reverted
(because of Atom failures), this time specifying explicitly the CPU in
addition to the triple. The test case now checks all variations (data
mode, cpu Atom vs. Core).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174542 91177308-0d34-0410-b5e6-96231b3b80d8
Weakly defined symbols should evaluate to 0 if they're undefined at
link-time. This is impossible to do with the usual address generation
patterns, so we should use a literal pool entry to materlialise the
address.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174518 91177308-0d34-0410-b5e6-96231b3b80d8
These instructions are a late addition to the architecture, and may
yet end up behind an optional attribute, but for now they're available
at all times.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174496 91177308-0d34-0410-b5e6-96231b3b80d8
This adds hints to the various "prfm" instructions so that they can
affect the instruction cache as well as the data cache.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174495 91177308-0d34-0410-b5e6-96231b3b80d8
Failure: undefined symbol 'Lline_table_start0'.
Root-cause: we use a symbol subtraction to calculate at_stmt_list, but
the line table entries are not dumped in the assembly.
Fix: use zero instead of a symbol subtraction for Compile Unit 0.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174479 91177308-0d34-0410-b5e6-96231b3b80d8
We generate one line table for each compilation unit in the object file.
Reviewed by Eric and Kevin.
rdar://problem/13067005
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174445 91177308-0d34-0410-b5e6-96231b3b80d8
is a vararg function.
The original code was examining flag OutputArg::IsFixed to determine whether
CC_MipsN_VarArg or CC_MipsN should be called. This is not correct, since this
flag is often set to false when the function being analyzed is a non-variadic
function.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174442 91177308-0d34-0410-b5e6-96231b3b80d8
base point of a load, and the overall alignment of the load. This caused infinite loops in DAG combine with the
original application of this patch.
ORIGINAL COMMIT LOG:
When the target-independent DAGCombiner inferred a higher alignment for a load,
it would replace the load with one with the higher alignment. However, it did
not place the new load in the worklist, which prevented later DAG combines in
the same phase (for example, target-specific combines) from ever seeing it.
This patch corrects that oversight, and updates some tests whose output changed
due to slightly different DAGCombine outputs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174431 91177308-0d34-0410-b5e6-96231b3b80d8
Currently, when a fragment is relaxed, its size is modified, but its
offset is not (it gets laid out as a side effect of checking whether
it needs relaxation), then all subsequent fragments are invalidated
because their offsets need to change. When bundling is enabled,
relaxed fragments need to get laid out again, because the increase in
size may push it over a bundle boundary. So instead of only
invalidating subsequent fragments, also invalidate the fragment that
gets relaxed, which causes it to get laid out again.
This patch also fixes some trailing whitespace and fixes the
bundling-related debug output of MCFragments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174401 91177308-0d34-0410-b5e6-96231b3b80d8
Emitting the function name allows us to check for it in the FileCheck
tests so we can make sure FileCheck is checking the output of the
correct function.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174392 91177308-0d34-0410-b5e6-96231b3b80d8
In the loop vectorizer cost model, we used to ignore stores/loads of a pointer
type when computing the widest type within a loop. This meant that if we had
only stores/loads of pointers in a loop we would return a widest type of 8bits
(instead of 32 or 64 bit) and therefore a vector factor that was too big.
Now, if we see a consecutive store/load of pointers we use the size of a pointer
(from data layout).
This problem occured in SingleSource/Benchmarks/Shootout-C++/hash.cpp (reduced
test case is the first test in vector_ptr_load_store.ll).
radar://13139343
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174377 91177308-0d34-0410-b5e6-96231b3b80d8
It caused hangups in compiling clang/lib/Parse/ParseDecl.cpp and clang/lib/Driver/Tools.cpp in stage2 on some hosts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174374 91177308-0d34-0410-b5e6-96231b3b80d8
The sh_link in the ELF section header of .ARM.exidx should
be filled with the section index of the corresponding text
section.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174372 91177308-0d34-0410-b5e6-96231b3b80d8
and enables the instruction printer to print aliased
instructions.
Due to usage of RegisterOperands a change in common
code (utils/TableGen/AsmWriterEmitter.cpp) is required
to get the correct register value if it is a RegisterOperand.
Contributer: Vladimir Medic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174358 91177308-0d34-0410-b5e6-96231b3b80d8
it would replace the load with one with the higher alignment. However, it did
not place the new load in the worklist, which prevented later DAG combines in
the same phase (for example, target-specific combines) from ever seeing it.
This patch corrects that oversight, and updates some tests whose output changed
due to slightly different DAGCombine outputs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174343 91177308-0d34-0410-b5e6-96231b3b80d8
Per discussion in rdar://13127907, we should emit a hard error only if
people write code where the requested alignment is larger than achievable
and assumes the low bits are zeros. A warning should be good enough when
we are not sure if the source code assumes the low bits are zeros.
rdar://13127907
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174336 91177308-0d34-0410-b5e6-96231b3b80d8
I didn't see those because the test case used "not grep". FileCheck the test and
XFAIL it, preserving the old optimization, so this can be fixed eventually.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174330 91177308-0d34-0410-b5e6-96231b3b80d8
This required disabling a PowerPC optimization that did the following:
input:
x = BUILD_VECTOR <i32 16, i32 16, i32 16, i32 16>
lowered to:
tmp = BUILD_VECTOR <i32 8, i32 8, i32 8, i32 8>
x = ADD tmp, tmp
The add now gets folded immediately and we're back at the BUILD_VECTOR we
started from. I don't see a way to fix this currently so I left it disabled
for now.
Fix some trivially foldable X86 tests too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174325 91177308-0d34-0410-b5e6-96231b3b80d8
Swift has a renaming dependency if we load into D subregisters. We don't have a
way of distinguishing between insertelement operations of values from loads and
other values. Therefore, we are pessimistic for now (The performance problem
showed up in example 14 of gcc-loops).
radar://13096933
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174300 91177308-0d34-0410-b5e6-96231b3b80d8
The main lists of debug info metadata attached to the compile_unit had an extra
layer of metadata nodes they went through for no apparent reason. This patch
removes that (& still passes just as much of the GDB 7.5 test suite). If anyone
can show evidence as to why these extra metadata nodes are there I'm open to
reverting this patch & documenting why they're there.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174266 91177308-0d34-0410-b5e6-96231b3b80d8
1) allows the use of RIP-relative addressing in 32-bit LEA instructions under
x86-64 (ILP32 and LP64)
2) separates the size of address registers in 64-bit LEA instructions from
control by ILP32/LP64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174208 91177308-0d34-0410-b5e6-96231b3b80d8
Prepare it for vectors of pointers and handle simple cases. We don't handle
complicated cases because accumulateConstantOffset bails on pointer vectors.
Fixes selfhost on i386.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174179 91177308-0d34-0410-b5e6-96231b3b80d8
Only Linux is supported at the moment, and other platforms quickly fault. As a
result these tests would fail on non-Linux hosts. It may be worth making the
tests more generic again as more platforms are supported.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174170 91177308-0d34-0410-b5e6-96231b3b80d8
remaining use of AliasAnalysis concepts such as isIdentifiedObject to
prove pointer inequality.
@external_compare in test/Transforms/InstSimplify/compare.ll shows a simple
case where a noalias argument can be equal to a global variable address, and
while AliasAnalysis can get away with saying that these pointers don't alias,
instsimplify cannot say that they are not equal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174122 91177308-0d34-0410-b5e6-96231b3b80d8
be equal, since there's nothing preventing a caller from correctly predicting
the stack location of an alloca.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174119 91177308-0d34-0410-b5e6-96231b3b80d8
The AttrBuilder is for building a collection of attributes. The Attribute object
holds only one attribute. So it's not really useful for the Attribute object to
have a creator which takes an AttrBuilder.
This has two fallouts:
1. The AttrBuilder no longer holds its internal attributes in a bit-mask form.
2. The attributes are now ordered alphabetically (hence why the tests have changed).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174110 91177308-0d34-0410-b5e6-96231b3b80d8
This is a re-worked version of r174048.
Given source IR:
call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !14), !dbg !15
we used to generate
call void @llvm.dbg.declare(metadata !27, metadata !28), !dbg !29!27 = metadata !{null}
With this patch, we will correctly generate
call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !27), !dbg !28
Looking up %argc.addr in ValueMap will return null, since %argc.addr is already
correctly set up, we can use identity mapping.
rdar://problem/13089880
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174093 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds support for AArch64 (ARM's 64-bit architecture) to
LLVM in the "experimental" category. Currently, it won't be built
unless requested explicitly.
This initial commit should have support for:
+ Assembly of all scalar (i.e. non-NEON, non-Crypto) instructions
(except the late addition CRC instructions).
+ CodeGen features required for C++03 and C99.
+ Compilation for the "small" memory model: code+static data <
4GB.
+ Absolute and position-independent code.
+ GNU-style (i.e. "__thread") TLS.
+ Debugging information.
The principal omission, currently, is performance tuning.
This patch excludes the NEON support also reviewed due to an outbreak of
batshit insanity in our legal department. That will be committed soon bringing
the changes to precisely what has been approved.
Further reviews would be gratefully received.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174054 91177308-0d34-0410-b5e6-96231b3b80d8
register for inline asm. This conforms to how gcc allows for effective
casting of inputs into gprs (fprs is already handled).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@174008 91177308-0d34-0410-b5e6-96231b3b80d8
for example, a one-past-the-end pointer from one global variable may
be equal to the base pointer of another global variable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173995 91177308-0d34-0410-b5e6-96231b3b80d8
This is the first commit of a large series which will add support for the
QPX vector instruction set to the PowerPC backend. This instruction set is
used on the IBM Blue Gene/Q supercomputers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173973 91177308-0d34-0410-b5e6-96231b3b80d8
Given source IR:
call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !14), !dbg !15
we used to generate
call void @llvm.dbg.declare(metadata !27, metadata !28), !dbg !29!27 = metadata !{null}
With this patch, we will correctly generate
call void @llvm.dbg.declare(metadata !{i32* %argc.addr}, metadata !27), !dbg !28
Looking up %argc.addr in ValueMap will return null, since %argc.addr is already
correctly set up, we can use identity mapping.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173946 91177308-0d34-0410-b5e6-96231b3b80d8
This adds a new --with-python option to allow configuration of the python binary
for building. If not specified, $PATH will be searched for common python binary
names (python, python2, python3). If specified, and the path is not executable,
it will attempt to search $PATH.
Signed-off-by: Saleem Abdulrasool <compnerd@compnerd.org>
Reviewed-by: Eric Christopher <echristo@gmail.com>, Daniel Dunbar <daniel@zuster.org>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173890 91177308-0d34-0410-b5e6-96231b3b80d8
and update ELF header e_flags.
Currently gathering information such as symbol,
section and data is done by collecting it in an
MCAssembler object. From MCAssembler and MCAsmLayout
objects ELFObjectWriter::WriteObject() forms and
streams out the ELF object file.
This patch just adds a few members to the MCAssember
class to store and access the e_flag settings. It
allows for runtime additions to the e_flag by
assembler directives. The standalone assembler can
get to MCAssembler from getParser().getStreamer().getAssembler().
This patch is the generic infrastructure and will be
followed by patches for ARM and Mips for their target
specific use.
Contributer: Jack Carter
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173882 91177308-0d34-0410-b5e6-96231b3b80d8
Changing ARMBaseTargetMachine to return ARMTargetLowering intead of
the generic one (similar to x86 code).
Tests showing which instructions were added to cast when necessary
or cost zero when not. Downcast to 16 bits are not lowered in NEON,
so costs are not there yet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173849 91177308-0d34-0410-b5e6-96231b3b80d8
The ARM and Thumb variants of LDREXD and STREXD have different constraints and
take different operands. Previously the code expanding atomic operations didn't
take this into account and asserted in Thumb mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173780 91177308-0d34-0410-b5e6-96231b3b80d8
The AttributeSetNode contains all of the attributes. This removes one (hopefully
last) use of the Attribute class as a container of multiple attributes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173761 91177308-0d34-0410-b5e6-96231b3b80d8
The common code in the post-RA scheduler to break anti-dependencies on the
critical path contained a flaw. In the reported case, an anti-dependency
between the overlapping registers %X4 and %R4 exists:
%X29<def> = OR8 %X4, %X4
%R4<def>, %X3<def,dead,tied3> = LBZU 1, %X3<kill,tied1>
The unpatched code breaks the dependency by replacing %R4 and its uses
with %R3, the first register on the available list. However, %R3 and
%X3 overlap, so this creates two overlapping definitions on the same
instruction.
The fix is straightforward, preventing selection of a register that
overlaps any other defined register on the same instruction.
The test case is reduced from the bug report, and verifies that we no
longer produce "lbzu 3, 1(3)" when breaking this anti-dependency.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173706 91177308-0d34-0410-b5e6-96231b3b80d8
It is way too slow. Change the default option value to 0.
Always do exact shadow propagation for unsigned ICmp with constants, it is
cheap (under 1% cpu time) and required for correctness.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173682 91177308-0d34-0410-b5e6-96231b3b80d8
Fix that by adding a cast to the shift expander. This came up with vector shifts
on sse-less X86 CPUs.
<2 x i64> = shl <2 x i64> <2 x i64>
-> i64,i64 = shl i64 i64; shl i64 i64
-> i32,i32,i32,i32 = shl_parts i32 i32 i64; shl_parts i32 i32 i64
Now we cast the last two i64s to the right type. Fixes the crash in PR14668.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173615 91177308-0d34-0410-b5e6-96231b3b80d8
This adds support for LLVM to accept metadata that doesn't include a top level
lexical block in a function. Specifically LLVM couldn't handle this when there
were file changes relating to these blocks. I've updated a few test cases to
ensure other functionality (such as inlining) isn't affected by this change, but
haven't pervasively updated all the test cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173592 91177308-0d34-0410-b5e6-96231b3b80d8
This catches many cases where we can emit a more efficient shuffle for a
specific mask or when the mask contains undefs. Once the splat is lowered to
unpacks we can't do that anymore.
There is a possibility of moving the promotion after pshufb matching, but I'm
not sure if pshufb with a mask loaded from memory is faster than 3 shuffles, so
I avoided that for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173569 91177308-0d34-0410-b5e6-96231b3b80d8
This provides a place to add customized operation cost information and
control some other target-specific IR-level transformations.
The only non-trivial logic in this checkin assigns a higher cost to
unaligned loads and stores (covered by the included test case).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173520 91177308-0d34-0410-b5e6-96231b3b80d8
The test runner does not rewrite instances of /dev/null inside the
quoted sh command. /dev/null does not exist, so opt will fail to open
it, and return a non-zero exit code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173509 91177308-0d34-0410-b5e6-96231b3b80d8
Cygwin git-svn will faithfully forward the svn properties all the way
down to the NTFS executable permission. Without the +x bit, tests using
these scripts fail with "Access Denied".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173508 91177308-0d34-0410-b5e6-96231b3b80d8
These tests in particular try to use escaped square brackets as an
argument to grep, which is failing for me with native win32 python. It
appears the backslash is being lost near the CreateProcess*() call.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173506 91177308-0d34-0410-b5e6-96231b3b80d8
(defined by the x32 ABI) mode, in which case its pointers are 32-bits
in size. This knowledge is also added to X86RegisterInfo that now
returns the appropriate registers in getPointerRegClass.
There are many outcomes to this change. In order to keep the patches
separate and manageable, we start by focusing on some simple testable
cases. The patch adds a test with passing a pointer to a function -
focusing on the difference between the two data models for x86-64.
Another test is added for handling of 'sret' arguments (and
functionality is added in X86ISelLowering to make it work).
A note on naming: the "x32 ABI" document refers to the AMD64
architecture (in LLVM it's distinguished by being is64Bits() in the
x86 subtarget) with two variations: the LP64 (default) data model, and
the ILP32 data model. This patch adds predicates to the subtarget
which are consistent with this naming scheme.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@173503 91177308-0d34-0410-b5e6-96231b3b80d8