These are named following the IEEE-754 names for these
functions, rather than the libm fmin / fmax to avoid
possible ambiguities. Some languages may implement something
resembling fmin / fmax which return NaN if either operand is
to propagate errors. These implement the IEEE-754 semantics
of returning the other operand if either is a NaN representing
missing data.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220341 91177308-0d34-0410-b5e6-96231b3b80d8
This requires incorporating __GNUC_PATCHLEVEL__ into our prerequisite
check, and renaming our __GNUC_PREREQ to LLVM_GNUC_PREREQ, since it is
now functionally different.
Patch by Chilledheart!
Differential Revision: http://reviews.llvm.org/D5879
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220332 91177308-0d34-0410-b5e6-96231b3b80d8
This enables targets to adapt their pass pipeline to the register
allocator in use. For example, with the AArch64 backend, using PBQP
with the cortex-a57, the FPLoadBalancing pass is no longer necessary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220321 91177308-0d34-0410-b5e6-96231b3b80d8
It is just too easy to use a virtual register intead of a NodeId without a
compiler warning. This does not fix the fundamental problem, i.e. both
have the same underlying types, but increases the likelyhood to detect it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220303 91177308-0d34-0410-b5e6-96231b3b80d8
inttoptr or ptrtoint cast provided there is datalayout available.
Eventually, the datalayout can just be required but in practice it will
always be there today.
To go with the ability to expose available values requiring a ptrtoint
or inttoptr cast, helpers are added to perform one of these three casts.
These smarts are necessary to finish canonicalizing loads and stores to
the operational type requirements without regressing fundamental
combines.
I've added some test cases. These should actually improve as the load
combining and store combining improves, but they may fundamentally be
highlighting some missing combines for select in addition to exercising
the specific added logic to load analysis.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220277 91177308-0d34-0410-b5e6-96231b3b80d8
Every target we support has support for assembly that looks like
a = b - c
.long a
What is special about MachO is that the above combination suppresses the
production of a relocation.
With this change we avoid producing the intermediary labels when they don't
add any value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220256 91177308-0d34-0410-b5e6-96231b3b80d8
Our metadata scheme lazily assigns IDs to string metadata, but we have a mechanism to preassign them as well. Using a preassigned ID is helpful since we get compile time type checking, and avoid some (minimal) string construction and comparison. This change adds enum value for three existing metadata types:
+ MD_nontemporal = 9, // "nontemporal"
+ MD_mem_parallel_loop_access = 10, // "llvm.mem.parallel_loop_access"
+ MD_nonnull = 11 // "nonnull"
I went through an updated various uses as well. I made no attempt to get all uses; I focused on the ones which were easily grepable and easily to translate. For example, there were several items in LoopInfo.cpp I chose not to update.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220248 91177308-0d34-0410-b5e6-96231b3b80d8
be BigEndian so the default can continue to be zero-initialized.
This is one of the prerequisites to making DataLayout a constant and
always available part of every module.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220193 91177308-0d34-0410-b5e6-96231b3b80d8
this header to remove numerous formatting inconsistencies that impede
making simple changes here without large diffs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220192 91177308-0d34-0410-b5e6-96231b3b80d8
implementation.
This is good for a ~6% reduction in total compile time on the nightly test suite
when running with -regalloc=pbqp.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220183 91177308-0d34-0410-b5e6-96231b3b80d8
This operation is analogous to its counterpart in DenseMap: It allows lookup
via cheap-to-construct keys (provided that getHashValue and isEqual are
implemented for the cheap key-type in the DenseMapInfo specialization).
Thanks to Chandler for the review.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220168 91177308-0d34-0410-b5e6-96231b3b80d8
The only difference from r219829 is using
getOrCreateSectionSymbol(*ELFSec)
instead of
GetOrCreateSymbol(ELFSec->getSectionName())
in ELFObjectWriter which causes us to use the correct section symbol even if
we have multiple sections with the same name.
Original messages:
r219829:
Correctly handle references to section symbols.
When processing assembly like
.long .text
we were creating a new undefined symbol .text. GAS on the other hand would
handle that as a reference to the .text section.
This patch implements that by creating the section symbols earlier so that
they are visible during asm parsing.
The patch also updates llvm-readobj to print the symbol number in the relocation
dump so that the test can differentiate between two sections with the same name.
r219835:
Allow forward references to section symbols.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220021 91177308-0d34-0410-b5e6-96231b3b80d8
Revert "Correctly handle references to section symbols."
Revert "Allow forward references to section symbols."
Rui found a regression I am debugging.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@220010 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Backends can use setInsertFencesForAtomic to signal to the middle-end that
montonic is the only memory ordering they can accept for
stores/loads/rmws/cmpxchg. The code lowering those accesses with a stronger
ordering to fences + monotonic accesses is currently living in
SelectionDAGBuilder.cpp. In this patch I propose moving this logic out of it
for several reasons:
- There is lots of redundancy to avoid: extremely similar logic already
exists in AtomicExpand.
- The current code in SelectionDAGBuilder does not use any target-hooks, it
does the same transformation for every backend that requires it
- As a result it is plain *unsound*, as it was apparently designed for ARM.
It happens to mostly work for the other targets because they are extremely
conservative, but Power for example had to switch to AtomicExpand to be
able to use lwsync safely (see r218331).
- Because it produces IR-level fences, it cannot be made sound ! This is noted
in the C++11 standard (section 29.3, page 1140):
```
Fences cannot, in general, be used to restore sequential consistency for atomic
operations with weaker ordering semantics.
```
It can also be seen by the following example (called IRIW in the litterature):
```
atomic<int> x = y = 0;
int r1, r2, r3, r4;
Thread 0:
x.store(1);
Thread 1:
y.store(1);
Thread 2:
r1 = x.load();
r2 = y.load();
Thread 3:
r3 = y.load();
r4 = x.load();
```
r1 = r3 = 1 and r2 = r4 = 0 is impossible as long as the accesses are all seq_cst.
But if they are lowered to monotonic accesses, no amount of fences can prevent it..
This patch does three things (I could cut it into parts, but then some of them
would not be tested/testable, please tell me if you would prefer that):
- it provides a default implementation for emitLeadingFence/emitTrailingFence in
terms of IR-level fences, that mimic the original logic of SelectionDAGBuilder.
As we saw above, this is unsound, but the best that can be done without knowing
the targets well (and there is a comment warning about this risk).
- it then switches Mips/Sparc/XCore to use AtomicExpand, relying on this default
implementation (that exactly replicates the logic of SelectionDAGBuilder, so no
functional change)
- it finally erase this logic from SelectionDAGBuilder as it is dead-code.
Ideally, each target would define its own override for emitLeading/TrailingFence
using target-specific fences, but I do not know the Sparc/Mips/XCore memory model
well enough to do this, and they appear to be dealing fine with the ARM-inspired
default expansion for now (probably because they are overly conservative, as
Power was). If anyone wants to compile fences more agressively on these
platforms, the long comment should make it clear why he should first override
emitLeading/TrailingFence.
Test Plan: make check-all, no functional change
Reviewers: jfb, t.p.northover
Subscribers: aemerson, llvm-commits
Differential Revision: http://reviews.llvm.org/D5474
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219957 91177308-0d34-0410-b5e6-96231b3b80d8
If a square root call has an FP multiplication argument that can be reassociated,
then we can hoist a repeated factor out of the square root call and into a fabs().
In the simplest case, this:
y = sqrt(x * x);
becomes this:
y = fabs(x);
This patch relies on an earlier optimization in instcombine or reassociate to put the
multiplication tree into a canonical form, so we don't have to search over
every permutation of the multiplication tree.
Because there are no IR-level FastMathFlags for intrinsics (PR21290), we have to
use function-level attributes to do this optimization. This needs to be fixed
for both the intrinsics and in the backend.
Differential Revision: http://reviews.llvm.org/D5787
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219944 91177308-0d34-0410-b5e6-96231b3b80d8
Clang CodeGen had a utility function for creating pointer alignment assumptions
using the @llvm.assume intrinsic. This functionality will also be needed by the
inliner (to preserve function-argument alignment attributes when inlining), so
this moves the utility function into IRBuilder where it can be used both by
Clang CodeGen and also other LLVM-level code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219875 91177308-0d34-0410-b5e6-96231b3b80d8
This CL introduces MachOObjectFile::getUuid(). This function returns an ArrayRef to the object file's UUID, or an empty ArrayRef if the object file doesn't contain an LC_UUID load command.
The new function is gonna be used by llvm-symbolizer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219866 91177308-0d34-0410-b5e6-96231b3b80d8
Store `User::NumOperands` (and `MDNode::NumOperands`) in `Value`.
On 64-bit host architectures, this reduces `sizeof(User)` and all
subclasses by 8, and has no effect on `sizeof(Value)` (or, incidentally,
on `sizeof(MDNode)`).
On 32-bit host architectures, this increases `sizeof(Value)` by 4.
However, it has no effect on `sizeof(User)` and `sizeof(MDNode)`, so the
only concrete subclasses of `Value` that actually see the increase are
`BasicBlock`, `Argument`, `InlineAsm`, and `MDString`. Moreover, I'll
be shocked and confused if this causes a tangible memory regression.
This has no functionality change (other than memory footprint).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219845 91177308-0d34-0410-b5e6-96231b3b80d8
A follow-up commit will modify the memory-layout of `Value`, `User`, and
`MDNode`. First fix the comments to be doxygen-friendly (and to follow
the coding standards).
- Use "\brief" instead of "repeatedName -".
- Add a brief intro where it was missing.
- Remove duplicated comments from source files (and a couple of
noisy/trivial comments altogether).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219844 91177308-0d34-0410-b5e6-96231b3b80d8
When processing assembly like
.long .text
we were creating a new undefined symbol .text. GAS on the other hand would
handle that as a reference to the .text section.
This patch implements that by creating the section symbols earlier so that
they are visible during asm parsing.
The patch also updates llvm-readobj to print the symbol number in the relocation
dump so that the test can differentiate between two sections with the same name.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219829 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Currently an error is thrown if bundle alignment mode is set more than once
per module (either via the API or the .bundle_align_mode directive). This
change allows setting it multiple times as long as the alignment doesn't
change.
Also nested bundle_lock groups are currently not allowed. This change allows
them, with the effect that the group stays open until all nests are exited,
and if any of the bundle_lock directives has the align_to_end flag, the
group becomes align_to_end.
These changes make the bundle aligment simpler to use in the compiler, and
also better match the corresponding support in GNU as.
Reviewers: jvoung, eliben
Differential Revision: http://reviews.llvm.org/D5801
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219811 91177308-0d34-0410-b5e6-96231b3b80d8
A number of comment cleanups:
- Remove duplicated function and class names from comments.
- Remove duplicated comments from source file (some of which were
out-of-sync).
- Move any unduplicated comments from source file to header.
- Remove some noisy comments entirely (e.g., a comment for
`DIDescriptor::print()` saying "print descriptor" just gets in the
way of reading the code).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219801 91177308-0d34-0410-b5e6-96231b3b80d8
Remove a strange round-trip through named metadata to assign preserved
local variables to their subprograms.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219798 91177308-0d34-0410-b5e6-96231b3b80d8
Peephole optimization that generates a single conditional branch
for csinc-branch sequences like in the examples below. This is
possible when the csinc sets or clears a register based on a condition
code and the branch checks that register. Also the condition
code may not be modified between the csinc and the original branch.
Examples:
1. Convert csinc w9, wzr, wzr, <CC>;tbnz w9, #0, 0x44
to b.<invCC>
2. Convert csinc w9, wzr, wzr, <CC>; tbz w9, #0, 0x44
to b.<CC>
rdar://problem/18506500
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219742 91177308-0d34-0410-b5e6-96231b3b80d8
A few minor changes to prevent @llvm.assume from interfering with loop
vectorization. First, treat @llvm.assume like the lifetime intrinsics, which
are scalarized (but don't otherwise interfere with the legality checking).
Second, ignore the cost of ephemeral instructions in the loop (these will go
away anyway during CodeGen).
Alignment assumptions and other uses of @llvm.assume can often end up inside of
loops that should be vectorized (this is not uncommon for assumptions generated
by __attribute__((align_value(n))), for example).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219741 91177308-0d34-0410-b5e6-96231b3b80d8
Eliminate library calls and intrinsic calls to fabs when the input
is a squared value.
Note that no unsafe-math / fast-math assumptions are needed for
this optimization.
Differential Revision: http://reviews.llvm.org/D5777
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219717 91177308-0d34-0410-b5e6-96231b3b80d8