Currently, llvm always emits a DWARF CIE with a version of 1, even when emitting
DWARF 3 or 4, which both support CIE version 3. This patch makes it emit the
newer CIE version when we are emitting DWARF 3 or 4. This will not reduce
compatibility, as we already emit other DWARF3/4 features, and is worth doing as
the DWARF3 spec removed some ambiguities in the interpretation of call frame
information.
It also fixes a minor bug where the "return address" field of the CIE was
encoded as a ULEB128, which is only valid when the CIE version is 3. There are
no test changes for this, because (as far as I can tell) none of the platforms
that we test have a return address register with a DWARF register number >127.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211272 91177308-0d34-0410-b5e6-96231b3b80d8
Patch by David Chisnall
His work was sponsored by: DARPA, AFRL
Some small modifications to the original patch: we now error if
it's not possible to expand an instruction (mips-expansions-bad.s has some
examples). Added some comments to the expansions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211271 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The functions that do the expansion now return false on success and true otherwise. This is so
we can catch some errors during the expansion (e.g.: immediate too large). The next patch adds some test cases.
Reviewers: vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D4214
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211269 91177308-0d34-0410-b5e6-96231b3b80d8
Before this change, the backend was unable to fold a build_vector dag
node with UNDEF operands into a single horizontal add/sub.
This patch teaches how to combine a build_vector with UNDEF operands into a
horizontal add/sub when possible. The algorithm conservatively avoids to combine
a build_vector with only a single non-UNDEF operand.
Added test haddsub-undef.ll to verify that we correctly fold horizontal binop
even in the presence of UNDEFs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211265 91177308-0d34-0410-b5e6-96231b3b80d8
* Find factorization opportunities using identity values.
* Find factorization opportunities by treating shl(X, C) as mul (X, shl(C))
* Keep NSW flag while simplifying instruction using factorization.
This fixes PR19263.
Differential Revision: http://reviews.llvm.org/D3799
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211261 91177308-0d34-0410-b5e6-96231b3b80d8
These errors are strictly unrecoverable and indicate serious issues such as
conflicting option names or an incorrectly linked LLVM distribution.
With this change, the errors actually get detected so tests don't pass
silently.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211260 91177308-0d34-0410-b5e6-96231b3b80d8
InstCombineMulDivRem has:
// Canonicalize (X+C1)*CI -> X*CI+C1*CI.
InstCombineAddSub has:
// W*X + Y*Z --> W * (X+Z) iff W == Y
These two transforms could fight with each other if C1*CI would not fold
away to something simpler than a ConstantExpr mul.
The InstCombineMulDivRem transform only acted on ConstantInts until
r199602 when it was changed to operate on all Constants in order to
let it fire on ConstantVectors.
To fix this, make this transform more careful by checking to see if we
actually folded away C1*CI.
This fixes PR20079.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211258 91177308-0d34-0410-b5e6-96231b3b80d8
used by all of the MC level tools and codegen. Fix up all uses
in the compiler to use this and set it on the context accordingly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211257 91177308-0d34-0410-b5e6-96231b3b80d8
We would get confused by '@' characters in symbol names, we would
mistake the text following them for the variant kind.
When an identifier a string, the variant kind will never show up inside
of it. Instead, check to see if there is a variant following the
string.
This fixes PR19965.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211249 91177308-0d34-0410-b5e6-96231b3b80d8
These will be used for custom lowering and for library
implementations of various math functions, so it's useful
to expose these as builtins.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211247 91177308-0d34-0410-b5e6-96231b3b80d8
This required untangling a mess of headers that included around.
This a recommit of r210953 with a fix for the removed accessor
for JITInfo.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211233 91177308-0d34-0410-b5e6-96231b3b80d8
fat files containing archives.
Also fix a bug in MachOUniversalBinary::ObjectForArch::ObjectForArch()
where it needed a >= when comparing the Index with the number of
objects in a fat file. As the index starts at 0.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211230 91177308-0d34-0410-b5e6-96231b3b80d8
The difference from rint isn't really relevant here,
so treat them as equivalent. OpenCL doesn't have nearbyint,
so this is sort of pointless other than for completeness.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211229 91177308-0d34-0410-b5e6-96231b3b80d8
This contains all the previous patches + getlod support on top of it.
It doesn't use SDNodes anymore, so it's quite small.
It also adds v16i8 to SReg_128, which is used for the sampler descriptor.
Reviewed-by: Tom Stellard
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211228 91177308-0d34-0410-b5e6-96231b3b80d8
This change has a bit of a trickle down effect due to the fact that
there are a number of derived implementations of ExecutionEngine,
and that the mutex is not tightly encapsulated so is used by other
classes directly.
Reviewed by: rnk
Differential Revision: http://reviews.llvm.org/D4196
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211214 91177308-0d34-0410-b5e6-96231b3b80d8
LLVMGetBitcodeModuleInContext should not take ownership on error. I will
try to localize this odd api requirement, but this should get the bots green.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211213 91177308-0d34-0410-b5e6-96231b3b80d8
The main point of this class is to provide a cheap object interface to a bitcode
file, so it has to be as lazy as possible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211207 91177308-0d34-0410-b5e6-96231b3b80d8
We do have use cases for the bitcode reader owning the buffer or not, but we
always know which one we have when we construct it.
It might be possible to simplify this further, but this is a step in the
right direction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211205 91177308-0d34-0410-b5e6-96231b3b80d8
When emitting optimization remarks, we test for the presence of
instruction locations by testing for a valid llvm.dbg.cu annotation.
This is slightly inefficient because we can simply ask whether the
debug location we have is known or not.
Additionally, if my current plan works, I will need to remove the
llvm.dbg.cu annotation from the IL (or prevent it from being generated)
when -Rpass is used without -g. In those cases, we'll want to generate
line tables but we will want to prevent code generation from emitting
DWARF code for them.
Tested on x86_64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211204 91177308-0d34-0410-b5e6-96231b3b80d8
When looking at the 64-bit SVR4 indirect call sequence, I noticed
an unnecessary load of r12. And indeed the code says:
// R12 must contain the address of an indirect callee.
But this is not correct; in the 64-bit SVR4 (ELFv1) ABI, there is
no need to load r12 at this point. It seems this code and comment
is a remnant of code originally shared with the Darwin ABI ...
This patch simply removes the unnecessary load.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211203 91177308-0d34-0410-b5e6-96231b3b80d8
ARMTargetStreamer implements ConstantPool and AssmeblerConstantPools
to keep track of assembler-generated constant pools that are used for
ldr-pseudo.
When implementing ldr-pseudo for AArch64, these two classes can be reused.
So this patch factors them out from ARM target to the general MC lib.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211198 91177308-0d34-0410-b5e6-96231b3b80d8
During an indirect function call sequence on the 64-bit SVR4 ABI,
generate code must load and then restore the TOC register.
This does not use a regular LOAD instruction since the TOC
register r2 is marked as reserved. Instead, the are two
special instruction patterns:
let RST = 2, DS = 2 in
def LDinto_toc: DSForm_1a<58, 0, (outs), (ins g8rc:$reg),
"ld 2, 8($reg)", IIC_LdStLD,
[(PPCload_toc i64:$reg)]>, isPPC64;
let RST = 2, DS = 10, RA = 1 in
def LDtoc_restore : DSForm_1a<58, 0, (outs), (ins),
"ld 2, 40(1)", IIC_LdStLD,
[(PPCtoc_restore)]>, isPPC64;
Note that these not only restrict the destination of the
load to r2, but they also restrict the *source* of the
load to particular address combinations. The latter is
a problem when we want to support the ELFv2 ABI, since
there the TOC save slot is no longer at 40(1).
This patch replaces those two instructions with a single
instruction pattern that only hard-codes r2 as destination,
but supports generic addresses as source. This will allow
supporting the ELFv2 ABI, and also helps generate more
efficient code for calls to absolute addresses (allowing
simplification of the ppc64-calls.ll test case).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211193 91177308-0d34-0410-b5e6-96231b3b80d8
Note that I followed the AVX2 convention here and didn't add LLVM intrinsics
for stores. These can be generated with the nontemporal hint on LLVM IR
stores (see new test). The GCC builtins are lowered directly into nontemporal
stores.
<rdar://problem/17082571>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211176 91177308-0d34-0410-b5e6-96231b3b80d8
Use the max 64-bit element size with EVEX_CD8. This should work since element
size is ignored for a full-vector access (FVM).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211175 91177308-0d34-0410-b5e6-96231b3b80d8
The PowerPC back-end uses BLA to implement calls to functions at
known-constant addresses, which is apparently used for certain
system routines on Darwin.
However, with the 64-bit SVR4 ABI, this is actually incorrect.
An immediate function pointer value on this platform is not
directly usable as a target address for BLA:
- in the ELFv1 ABI, the function pointer value refers to the
*function descriptor*, not the code address
- in the ELFv2 ABI, the function pointer value refers to the
global entry point, but BL(A) would only be correct when
calling the *local* entry point
This bug didn't show up since using immediate function pointer
values is not usually done in the 64-bit SVR4 ABI in the first
place. However, I ran into this issue with a certain use case
of LLVM as JIT, where immediate function pointer values were
uses to implement callbacks from JITted code to helpers in
statically compiled code.
Fixed by simply not using BLA with the 64-bit SVR4 ABI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211174 91177308-0d34-0410-b5e6-96231b3b80d8
My patch r204634 to emit instructions in little-endian format failed to
handle those special cases where we emit a pair of instructions from a
single LLVM MC instructions (like the bl; nop pairs used to implement
the call sequence).
In those cases, we still need to emit the "first" instruction (the one
in the more significant word) first, on both big and little endian,
and not swap them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211171 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The assembler tries to reuse the destination register for memory operations whenever
it can but it's not possible to do so if the destination register is not a GPR.
Example:
ldc1 $f0, sym
should expand to:
lui $at, %hi(sym)
ldc1 $f0, %lo(sym)($at)
It's entirely wrong to expand to:
lui $f0, %hi(sym)
ldc1 $f0, %lo(sym)($f0)
Reviewers: dsanders
Reviewed By: dsanders
Differential Revision: http://reviews.llvm.org/D4173
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211169 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch doesn't really change the logic behind expandMemInst but it allows
us to assemble .S files that use .set noat with some macros. For example:
.set noat
lw $k0, offset($k1)
Can expand to:
lui $k0, %hi(offset)
addu $k0, $k0, $k1
lw $k0, %lo(offset)($k0)
with no need to access $at.
Reviewers: dsanders, vmedic
Reviewed By: dsanders, vmedic
Differential Revision: http://reviews.llvm.org/D4159
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211165 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Added negative test case so that we can be sure we handle erroneous situations
while parsing the .cpsetup directive.
Reviewers: dsanders
Reviewed By: dsanders
Differential Revision: http://reviews.llvm.org/D3681
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211160 91177308-0d34-0410-b5e6-96231b3b80d8
It looks like there are two versions of LowerCallTo here: the
SelectionDAGBuilder one is designed to operate on LLVM IR, and the
TargetLowering one in the case where everything is at DAG level.
Previously, only the SelectionDAGBuilder variant could handle demoting
an impossible return to sret semantics (before delegating to the
TargetLowering version), but this functionality is also useful for
certain libcalls (e.g. 128-bit operations on 32-bit x86). So this
commit moves the sret handling down a level.
rdar://problem/17242889
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211155 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Provides an abstraction for a random number generator (RNG) that produces a stream of pseudo-random numbers.
The current implementation uses C++11 facilities and is therefore not cryptographically secure.
The RNG is salted with the text of the current command line invocation.
In addition, a user may specify a seed (reproducible builds).
In clang, the seed can be set via
-frandom-seed=X
In the back end, the seed can be set via
-rng-seed=X
This is the llvm part of the patch.
clang part: D3391
Reviewers: ahomescu, rinon, nicholas, jfb
Reviewed By: jfb
Subscribers: jfb, perl
Differential Revision: http://reviews.llvm.org/D3390
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211145 91177308-0d34-0410-b5e6-96231b3b80d8
ReconstructShuffle() may wrongly creat a CONCAT_VECTOR trying to
concat 2 of v2i32 into v4i16. This commit is to fix this issue and
try to generate UZP1 instead of lots of MOV and INS.
Patch is initalized by Kevin Qin, and refactored by Tim Northover.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211144 91177308-0d34-0410-b5e6-96231b3b80d8
This patch is a follow up to r211040 & r211052. Rather than bailing out of fast
isel this patch will generate an alternate instruction (movabsq) instead of the
leaq. While this will always have enough room to handle the 64 bit displacment
it is generally over kill for internal symbols (most displacements will be
within 32 bits) but since we have no way of communicating the code model to the
the assmebler in order to avoid flagging an absolute leal/leaq as illegal when
using a symbolic displacement.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211130 91177308-0d34-0410-b5e6-96231b3b80d8
This optimizes predicates for certain compares, such as fcmp oeq %x, %x to
fcmp ord %x, %x. The latter one is more efficient to generate.
The same optimization is applied to conditional branches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211126 91177308-0d34-0410-b5e6-96231b3b80d8
This pattern loses some of its usefulness when the mutex type is
statically polymorphic as opposed to runtime polymorphic, as
swapping out the mutex type requires changing a significant number
of function parameters, and templatizing the function parameter
requires the methods to be defined in the headers.
Furthermore, if LLVM is compiled with threads disabled then there
may even be no mutex to acquire anyway, so it should not be up to
individual APIs to know whether or not acquiring a mutex is required
to use those APIs to begin with. It should be up to the user of the
API.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211125 91177308-0d34-0410-b5e6-96231b3b80d8
The OSX ranlib warns on files with no symbols, and lib/Support/WindowsError.cpp
was empty when building on non-windows.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211118 91177308-0d34-0410-b5e6-96231b3b80d8
We need to store a value greater than or equal to the number of LDS
bytes allocated by the shader in the m0 register in order for LDS
instructions to work correctly.
We always initialize m0 at the beginning of a shader, but this register
is also used for indirect addressing offsets, so we need to
re-initialize it any time we use indirect addressing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211107 91177308-0d34-0410-b5e6-96231b3b80d8
To make sure branches are in range, we need to do a better job of estimating
the length of an inline assembly block than "it's probably 1 instruction, who'd
write asm with more than that?".
Fortunately there's already a (highly suspect, see how many ways you can think
of to break it!) callback for this purpose, which is used by the other targets.
rdar://problem/17277590
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211095 91177308-0d34-0410-b5e6-96231b3b80d8
Multiplication by an integer with a number of trailing zero bits leaves
the same number of lower bits of the result initialized to zero.
This change makes MSan take this into account in the case of multiplication by
a compile-time constant.
We don't handle the general, non-constant, case because
(a) it's not going to be cheap (computation-wise);
(b) multiplication by a partially uninitialized value in user code is
a bad idea anyway.
Constant case must be handled because it appears from LLVM optimization of a
completely valid user code, as the test case in compiler-rt demonstrates.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211092 91177308-0d34-0410-b5e6-96231b3b80d8
Mimic r116632 in passing LLVM_VERSION_INFO from the Makefile build
system to the build. This improves the -version output of tools that
use llvm::cl under the configure+make system.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211091 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
As a starting step, we only use one simple heuristic: if the sign bits
of both a and b are zero, we can prove "add a, b" do not unsigned
overflow, and thus convert it to "add nuw a, b".
Updated all affected tests and added two new tests (@zero_sign_bit and
@zero_sign_bit2) in AddOverflow.ll
Test Plan: make check-all
Reviewers: eliben, rafael, meheff, chandlerc
Reviewed By: chandlerc
Subscribers: chandlerc, llvm-commits
Differential Revision: http://reviews.llvm.org/D4144
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211084 91177308-0d34-0410-b5e6-96231b3b80d8
r199771 accidently broke the logic that makes sure that SROA only splits
load on byte boundaries. If such a split happens, some bits get lost
when reassembling loads of wider types, causing data corruption.
Move the width check up to reject such splits early, avoiding the
corruption. Fixes PR19250.
Patch by: Björn Steinbrink <bsteinbr@gmail.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211082 91177308-0d34-0410-b5e6-96231b3b80d8
Make use of helper functions to simplify the branch and compare instruction
selection in FastISel. Also add test cases for compare and conditonal branch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211077 91177308-0d34-0410-b5e6-96231b3b80d8
[This is resubmitting r210721, which was reverted due to suspected breakage
which turned out to be unrelated].
Some extra review comments were addressed. See D4090 and D4147 for more details.
The Clang change that produces this metadata was committed in r210667
Patch by Mark Heffernan.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211076 91177308-0d34-0410-b5e6-96231b3b80d8
These parameters are intended to serve as sort of a contract that
you cannot access the functions outside of a mutex. However, the
entire JIT class cannot be accessed outside of a mutex anyway, and
all methods acquire a lock as soon as they are entered. Since the
containing class already is not intended to be thread-safe, it only
serves to add code clutter.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211071 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patches allows non conversions like i1=i2; where both are global ints.
In addition, arithmetic and other things start to work since fast-isel will use
existing patterns for non fast-isel from tablegen files where applicable.
In addition i8, i16 will work in this limited context for assignment without the need
for sign extension (zero or signed). It does not matter how i8 or i16 are loaded (zero or sign extended)
since only the 8 or 16 relevant bits are used and clang will ask for sign extension before using them in
arithmetic. This is all made more complete in forthcoming patches.
for example:
int i, j=1, k=3;
void foo() {
i = j + k;
}
Keep in mind that this pass is not enabled right now and is an experimental pass
It can only be enabled with a hidden option to llvm of -mips-fast-isel.
Test Plan: Run test-suite, loadstore2.ll and I will run some executable tests.
Reviewers: dsanders
Subscribers: mcrosier
Differential Revision: http://reviews.llvm.org/D3856
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211061 91177308-0d34-0410-b5e6-96231b3b80d8
Define an intrinsic for the frontend to use and pattern match it to
the RBIT instruction.
rdar://9283021
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211058 91177308-0d34-0410-b5e6-96231b3b80d8
We already have an ARMISD node. Create an intrinsic to map to it so we can
add support for the frontend __rbit() intrinsic.
rdar://9283021
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211057 91177308-0d34-0410-b5e6-96231b3b80d8
Rafael opened http://llvm.org/bugs/show_bug.cgi?id=19893 to track non-optimal
code generation for forming a function address that is local to the compile
unit. The existing code was treating both local and non-local functions
identically.
This patch fixes the problem by properly identifying local functions and
generating the proper addis/addi code. I also noticed that Rafael's earlier
changes to correct the surrounding code in PPCISelLowering.cpp were also
needed for fast instruction selection in PPCFastISel.cpp, so this patch
fixes that code as well.
The existing test/CodeGen/PowerPC/func-addr.ll is modified to test the new
code generation. I've added a -O0 run line to test the fast-isel code as
well.
Tested on powerpc64[le]-unknown-linux-gnu with no regressions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211056 91177308-0d34-0410-b5e6-96231b3b80d8
These were being used as unreferenced parameters to enforce that
the methods must not be called without holding a mutex, but all
of the methods in question were internal, and the methods were
only exposed through an interface whose entire purpose was to
serialize access to these structures, so expecting the methods
to be accessed under a mutex is reasonable enough.
Reviewed by: blaikie
Differential Revision: http://reviews.llvm.org/D4162
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211054 91177308-0d34-0410-b5e6-96231b3b80d8
Added comment to clarify why we r211040 choose to bail out of fast isel instead
of generating a more complicated relocation, and fix mislabelled register in the
comments of the asan test case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211052 91177308-0d34-0410-b5e6-96231b3b80d8
ARM v7M has ldrex/strex but not ldrexd/strexd. This means 32-bit
operations should work as normal, but 64-bit ones are almost certainly
doomed.
Patch by Phoebe Buckheister.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211042 91177308-0d34-0410-b5e6-96231b3b80d8
On x86_86 the lea instruction can only use a 32 bit immediate value. When
the code is compiled statically the RIP register is not used, meaning the
immediate is all that can be used for the relocation, which is not sufficient
in the case of targets more than +/- 2GB away. This patch bails out of fast
isel in those cases and reverts to DAG which does the right thing.
Test case included.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211040 91177308-0d34-0410-b5e6-96231b3b80d8
When LowerSwitch transforms a switch instruction into a tree of ifs it
is actually performing a binary search into the various case ranges, to
see if the current value falls into one cases range of values.
So, if we have a program with something like this:
switch (a) {
case 0:
do0();
break;
case 1:
do1();
break;
case 2:
do2();
break;
default:
break;
}
the code produced is something like this:
if (a < 1) {
if (a == 0) {
do0();
}
} else {
if (a < 2) {
if (a == 1) {
do1();
}
} else {
if (a == 2) {
do2();
}
}
}
This code is inefficient because the check (a == 1) to execute do1() is
not needed.
The reason is that because we already checked that (a >= 1) initially by
checking that also (a < 2) we basically already inferred that (a == 1)
without the need of an extra basic block spawned to check if actually (a
== 1).
The patch addresses this problem by keeping track of already
checked bounds in the LowerSwitch algorithm, so that when the time
arrives to produce a Leaf Block that checks the equality with the case
value / range the algorithm can decide if that block is really needed
depending on the already checked bounds .
For example, the above with "a = 1" would work like this:
the bounds start as LB: NONE , UB: NONE
as (a < 1) is emitted the bounds for the else path become LB: 1 UB:
NONE. This happens because by failing the test (a < 1) we know that the
value "a" cannot be smaller than 1 if we enter the else branch.
After the emitting the check (a < 2) the bounds in the if branch become
LB: 1 UB: 1. This is because by checking that "a" is smaller than 2 then
the upper bound becomes 2 - 1 = 1.
When it is time to emit the leaf block for "case 1:" we notice that 1
can be squeezed exactly in between the LB and UB, which means that if we
arrived to that block there is no need to emit a block that checks if (a
== 1).
Patch by: Marcello Maggioni <hayarms@gmail.com>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211038 91177308-0d34-0410-b5e6-96231b3b80d8
Originally I switched the LD/ST optimizer off in TargetMachine as it was previously, but Eric has suggested he'd prefer that it be short-circuited in the pass itself.
No functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211037 91177308-0d34-0410-b5e6-96231b3b80d8
This makes llvm-nm ignore members that are not sufficiently aligned for
lib/Object to handle.
These archives are invalid. GNU AR is able to handle this, but in general
just warns about broken archive members.
We should probably start warning too, but for now just make sure llvm-nm
exits with an 0.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211036 91177308-0d34-0410-b5e6-96231b3b80d8
Now that we have c++11, even things like ErrorOr<std::unique_ptr<...>> are
easy to use.
No intended functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211033 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
There is no change to the restrictions, just the result register is stored
once in the encoding rather than twice. The rt field is zero in
MIPS32r6/MIPS64r6.
Depends on D4119
Reviewers: zoran.jovanovic, jkolek, vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D4120
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211019 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The linked-load, store-conditional operations have been re-encoded such
that have a 9-bit offset instead of the 16-bit offset they have prior to
MIPS32r6/MIPS64r6.
While implementing this, I noticed that the atomic load/store pseudos always
emit a sign extension using sll and sra. I have improved this to use seb/seh
when they are available (MIPS32r2/MIPS64r2 and above).
Depends on D4118
Reviewers: jkolek, zoran.jovanovic, vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D4119
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211018 91177308-0d34-0410-b5e6-96231b3b80d8
given in the Unicode spec
That is, replace every maximal subpart of an ill-formed subsequence with one
U+FFFD.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211015 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The error message for the invalid.s cases isn't very helpful. It happens because
there is an instruction with a wider immediate that would have matched if the
NotMips32r6 predicate were true. I have some WIP to improve the message but it
affects most error messages for removed/re-encoded instructions on
MIPS32r6/MIPS64r6 and should therefore be a separate commit.
Depens on D4115
Reviewers: zoran.jovanovic, jkolek, vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D4117
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211012 91177308-0d34-0410-b5e6-96231b3b80d8
As a follow-up to r210375 which canonicalizes addrspacecast
instructions, this patch canonicalizes addrspacecast constant
expressions.
Given clang uses ConstantExpr::getAddrSpaceCast to emit addrspacecast
cosntant expressions, this patch is also a step towards having the
frontend emit canonicalized addrspacecasts.
Piggyback a minor refactor in InstCombineCasts.cpp
Update three affected tests in addrspacecast-alias.ll,
access-non-generic.ll and constant-fold-gep.ll and added one new test in
constant-fold-address-space-pointer.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211004 91177308-0d34-0410-b5e6-96231b3b80d8
I haven't nailed this down entirely, but this is about as small of a
test case as I can seem to construct and adequately demonstrates the
crasher. I'll continue investigating the root cause/fix(es).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210993 91177308-0d34-0410-b5e6-96231b3b80d8
It's valid to use FP_TO_SINT when asking for a smaller type (e.g. all
"unsigned int16" values fit into a "signed int32"), but the reverse
isn't true.
Unfortunately, I'm not actually aware of any architecture with
asymmetric FP_TO_SINT and FP_TO_UINT handling and the logic happens to
work in the symmetric case, so I can't actually write a test for this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210986 91177308-0d34-0410-b5e6-96231b3b80d8
There's probably no acatual change in behaviour here, just updating
the LowerFP_TO_INT function to be more similar to the reverse
implementation and updating costs to current CodeGen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210985 91177308-0d34-0410-b5e6-96231b3b80d8
This would assert if a constant address space was extern
and therefore didn't have an initializer. If the initializer
was undef, it would hit the unreachable unhandled initializer case.
An extern global should never really occur since we don't have
machine linking, but bugpoint likes to remove initializers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210967 91177308-0d34-0410-b5e6-96231b3b80d8
Now that we handle finding abstract variables at the end of the module,
remove the upfront handling and just ensure the abstract variable is
built when necessary.
In theory we could have a split implementation, where inlined variables
are immediately constructed referencing the abstract definition, and
concrete variables are delayed - but let's go with one solution for now
unless there's a reason not to.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210961 91177308-0d34-0410-b5e6-96231b3b80d8
This patch is to move GlobalMerge pass from Transform/Scalar
to CodeGen, because GlobalMerge depends on TargetMachine.
In the mean time, the macro INITIALIZE_TM_PASS is also moved
to CodeGen/Passes.h. With this fix we can avoid making
libScalarOpts depend on libCodeGen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210951 91177308-0d34-0410-b5e6-96231b3b80d8
Rather than relying on abstract variables looked up at the time the
concrete variable is created, look them up at the end of the module to
ensure they're referenced even if they're created after the concrete
definition. This completes/matches the work done in r209677 to handle
this for the subprograms themselves.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@210946 91177308-0d34-0410-b5e6-96231b3b80d8