When we hoist two loads above an if, we can preserve the nonnull metadata. We could also do the same for sinking them, but we appear to not handle metadata at all in that case.
Thanks to Hal for the review.
Differential Revision: http://reviews.llvm.org/D5910
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220392 91177308-0d34-0410-b5e6-96231b3b80d8
When a call to a double-precision libm function has fast-math semantics
(via function attribute for now because there is no IR-level FMF on calls),
we can avoid fpext/fptrunc operations and use the float version of the call
if the input and output are both float.
We already do this optimization using a command-line option; this patch just
adds the ability for fast-math to use the existing functionality.
I moved the cl::opt from InstructionCombining into SimplifyLibCalls because
it's only ever used internally to that class.
Modified the existing test cases to use the unsafe-fp-math attribute rather
than repeating all tests.
This patch should solve: http://llvm.org/bugs/show_bug.cgi?id=17850
Differential Revision: http://reviews.llvm.org/D5893
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220390 91177308-0d34-0410-b5e6-96231b3b80d8
When the profile for a function cannot be applied, we use to emit an
error. This seems extreme. The compiler can continue, it's just that the
optimization opportunities won't include profile information.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220386 91177308-0d34-0410-b5e6-96231b3b80d8
The tests test/CodeGen/Generic/select-cc.ll and
test/CodeGen/PowerPC/select-cc.ll both fail with VSX enabled. The
problem is that the lowering logic for the SELECT and SELECT_CC
operations doesn't currently support the VSX registers. This patch
fixes that.
In lib/Target/PowerPC/PPCInstrInfo.td, we have pseudos to handle this
for other register classes. Similar pseudos are added in
PPCInstrVSX.td (they must be there, because the "vsrc" register class
definition appears there) for the VSRC register class. The
SELECT_VSRC pseudo is then used in pattern matching for SELECT_CC.
The rest of the patch just adds logic for SELECT_VSRC wherever similar
logic appears for SELECT_VRRC.
There are no new test cases because the existing tests above test
this, along with a variant in test/CodeGen/PowerPC/vsx.ll.
After discussion with Hal, a future patch will add similar _VSFRC
variants to override f64 type handling (currently using F8RC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220385 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When using a profile, we used to require the use -gmlt so that we could
get access to the line locations. This is used to match line numbers in
the input profile to the line numbers in the function's IR.
But this is actually not necessary. The driver can provide source
location tracking without the emission of debug information. In these
cases, the annotation 'llvm.dbg.cu' is missing from the IR, but the
actual line location annotations are still present.
This patch adds a new way of looking for the start of the current
function. Instead of looking through the compile units in llvm.dbg.cu,
we can walk up the scope for the first instruction in the function with
a debug loc. If that describes the function, we use it. Otherwise, we
keep looking until we find one.
If no such instruction is found, we then give up and produce an error.
Reviewers: echristo, dblaikie
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5887
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220382 91177308-0d34-0410-b5e6-96231b3b80d8
ConstantFolding crashes when trying to InstSimplify the following load:
@a = private unnamed_addr constant %mst {
i8* inttoptr (i64 -1 to i8*),
i8* inttoptr (i64 -1 to i8*)
}, align 8
%x = load <2 x i8*>* bitcast (%mst* @a to <2 x i8*>*), align 8
This patch fix this by adding support to this type of folding:
%x = load <2 x i8*>* bitcast (%mst* @a to <2 x i8*>*), align 8
==> gets folded to:
%x = <2 x i8*> <i8* inttoptr (i64 -1 to i8*), i8* inttoptr (i64 -1 to i8*)>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220380 91177308-0d34-0410-b5e6-96231b3b80d8
ParamTLS (shadow for function arguments) is of limited size. This change
makes all arguments that do not fit unpoisoned, and avoids writing
past the end of a TLS buffer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220351 91177308-0d34-0410-b5e6-96231b3b80d8
On AArch64, GOT references are page relative (ADRP + LDR), so they can't be
applied until we know exactly where, within a page, the GOT entry will be in
the target address space.
Fixes <rdar://problem/18693976>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220347 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: Patches 202051 and 208013 added calls to LTO's PassManager which unconditionally add LoopVectorizePass and SLPVectorizerPass instead of following the logic in PassManagerBuilder::populateModulePassManager and honoring the -vectorize-loops -run-slp-after-loop-vectorization flags.
Reviewers: nadav, aschwaighofer, yijiang
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5884
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220345 91177308-0d34-0410-b5e6-96231b3b80d8
These are named following the IEEE-754 names for these
functions, rather than the libm fmin / fmax to avoid
possible ambiguities. Some languages may implement something
resembling fmin / fmax which return NaN if either operand is
to propagate errors. These implement the IEEE-754 semantics
of returning the other operand if either is a NaN representing
missing data.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220341 91177308-0d34-0410-b5e6-96231b3b80d8
Enumerate `MDNode`'s operands *before* the node itself, so that the
reader requires less RAUW. Although this will cause different code
paths to be hit in the reader, this should effectively be no
functionality change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220340 91177308-0d34-0410-b5e6-96231b3b80d8
combineMetadata is used when merging two instructions into one. This change teaches it how to merge 'nonnull' - i.e. only preserve it on the new instruction if it's set on both sources. This isn't actually used yet since I haven't adjusted any of the call sites to pass in nonnull as a 'known metadata'.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220325 91177308-0d34-0410-b5e6-96231b3b80d8
When changing the type of a load in Chandler's recent InstCombine changes, we can preserve the new 'nonnull' metadata.
I considered adding an assert since 'nonnull' is only valid on pointer types, but casting a pointer to a non-pointer would involve more than a bitcast anyways. If someone extends this transform to handle more than bitcasts, the verifier will report the malformed IR, so a separate assertion isn't needed. Also, the fpmath flags would have the same problem.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220324 91177308-0d34-0410-b5e6-96231b3b80d8
This enables targets to adapt their pass pipeline to the register
allocator in use. For example, with the AArch64 backend, using PBQP
with the cortex-a57, the FPLoadBalancing pass is no longer necessary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220321 91177308-0d34-0410-b5e6-96231b3b80d8
This function was complicated by the fact that it tried to perform
canonicalizations that were already preformed by InstSimplify. Remove
this extra code and move the tests over to InstSimplify. Add asserts to
make sure our preconditions hold before we make any assumptions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220314 91177308-0d34-0410-b5e6-96231b3b80d8
As coalescing registers is a benefit, the cost should be improved (i.e. made smaller) when coalescing is possible.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220302 91177308-0d34-0410-b5e6-96231b3b80d8
With VSX enabled, test/CodeGen/PowerPC/recipest.ll exposes a bug in
the FMA mutation pass. If we have a situation where a killed product
register is the same register as the FMA target, such as:
%vreg5<def,tied1> = XSNMSUBADP %vreg5<tied0>, %vreg11, %vreg5,
%RM<imp-use>; VSFRC:%vreg5 F8RC:%vreg11
then the substitution makes no sense. We end up getting a crash when
we try to extend the interval associated with the killed product
register, as there is already a live range for %vreg5 there. This
patch just disables the mutation under those circumstances.
Since recipest.ll generates different code with VMX enabled, I've
modified that test to use -mattr=-vsx. I've borrowed the code from
that test that exposed the bug and placed it in fma-mutate.ll, where
it tests several mutation opportunities including the "bad" one.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220290 91177308-0d34-0410-b5e6-96231b3b80d8
The 32-bit variants of the NEON scalar<->GPR move instructions are
also available in VFPv2. The 8- and 16-bit variants do require NEON.
Note that the checks in the test file are all -DAG because they are
checking a mixture of stdout and stderr, and the ordering is not
guaranteed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220288 91177308-0d34-0410-b5e6-96231b3b80d8
The Thumb2 LDRS?[BH] instructions are not valid when the destination
register is the PC (these encodings are used for preload hints).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220278 91177308-0d34-0410-b5e6-96231b3b80d8
inttoptr or ptrtoint cast provided there is datalayout available.
Eventually, the datalayout can just be required but in practice it will
always be there today.
To go with the ability to expose available values requiring a ptrtoint
or inttoptr cast, helpers are added to perform one of these three casts.
These smarts are necessary to finish canonicalizing loads and stores to
the operational type requirements without regressing fundamental
combines.
I've added some test cases. These should actually improve as the load
combining and store combining improves, but they may fundamentally be
highlighting some missing combines for select in addition to exercising
the specific added logic to load analysis.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220277 91177308-0d34-0410-b5e6-96231b3b80d8
Every target we support has support for assembly that looks like
a = b - c
.long a
What is special about MachO is that the above combination suppresses the
production of a relocation.
With this change we avoid producing the intermediary labels when they don't
add any value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220256 91177308-0d34-0410-b5e6-96231b3b80d8
When functions are inlined, instructions without debug information are
attributed to the call site's DebugLoc. After inlining, inlined static
allocas are moved to the caller's entry block, adjacent to the caller's
original static alloca instructions. By retaining the call site's
DebugLoc, these instructions could cause instructions that were
subsequently inserted at the entry block to pick up the same DebugLoc.
Patch by Wolfgang Pieb!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220255 91177308-0d34-0410-b5e6-96231b3b80d8
Our metadata scheme lazily assigns IDs to string metadata, but we have a mechanism to preassign them as well. Using a preassigned ID is helpful since we get compile time type checking, and avoid some (minimal) string construction and comparison. This change adds enum value for three existing metadata types:
+ MD_nontemporal = 9, // "nontemporal"
+ MD_mem_parallel_loop_access = 10, // "llvm.mem.parallel_loop_access"
+ MD_nonnull = 11 // "nonnull"
I went through an updated various uses as well. I made no attempt to get all uses; I focused on the ones which were easily grepable and easily to translate. For example, there were several items in LoopInfo.cpp I chose not to update.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220248 91177308-0d34-0410-b5e6-96231b3b80d8
Range metadata applies to loads, call, and invokes. We were validating that metadata applied to loads was correct according to the LangRef, but we were not validating metadata applied to calls or invokes. This change extracts the checking functionality to a common location, reuses it for all valid locations, and adds a simple test to ensure a misused range on a call gets reported.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220246 91177308-0d34-0410-b5e6-96231b3b80d8
X86 code to lower VSELECT messed a bit with the bits set in the mask of VSELECT
when it knows it can be lowered into BLEND. Indeed, only the high bits need to be
set for those and it optimizes those accordingly.
However, when the mask is a compile time constant, the lowering will be handled
by the generic optimizer and those modifications will generate bad code in the
generic optimizer.
This patch fixes that by preventing the optimization if the VSELECT will be
handled by the generic optimizer.
<rdar://problem/18675020>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220242 91177308-0d34-0410-b5e6-96231b3b80d8
The newly introduced 'nonnull' metadata is analogous to existing 'nonnull' attributes, but applies to load instructions rather than call arguments or returns. Long term, it would be nice to combine these into a single construct. The value of the load is allowed to vary between successive loads, but null is not a valid value to be loaded by any load marked nonnull.
Reviewed by: Hal Finkel
Differential Revision: http://reviews.llvm.org/D5220
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220240 91177308-0d34-0410-b5e6-96231b3b80d8
This patch improves support for commutative instructions in the x86 memory folding implementation by attempting to fold a commuted version of the instruction if the original folding fails - if that folding fails as well the instruction is 're-commuted' back to its original order before returning.
Updated version of r219584 (reverted in r219595) - the commutation attempt now explicitly ensures that neither of the commuted source operands are tied to the destination operand / register, which was the source of all the regressions that occurred with the original patch attempt.
Added additional regression test case provided by Joerg Sonnenberger.
Differential Revision: http://reviews.llvm.org/D5818
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220239 91177308-0d34-0410-b5e6-96231b3b80d8