a test case that was just grepping the debug stats output rather than
actually checking the generated code for anything useful.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218951 91177308-0d34-0410-b5e6-96231b3b80d8
intergrated much more fully into some logical part of the backend to
really understand what it is trying to accomplish and how to update it.
I suspect it no longer holds enough value to be worth having.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218950 91177308-0d34-0410-b5e6-96231b3b80d8
shufle switch.
I nuked a win64 config from one test as it doesn't really make sense to
cover that ABI specially for generic v2f32 tests...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218948 91177308-0d34-0410-b5e6-96231b3b80d8
two functions that really didn't have any interesting assertions, and
generated more precise tests for one of the others.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218946 91177308-0d34-0410-b5e6-96231b3b80d8
This patch broke 447.dealII on Darwin. I'm currently working on a reduced
test-case, but reverting for now to keep the bots happy.
<rdar://problem/18530107>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218944 91177308-0d34-0410-b5e6-96231b3b80d8
test cases that will change with the new vector shuffle lowering. This
gives us a nice baseline for deltas against. I've checked and removed
the cases where there were weird register usage being pinned down, and
all of these are extremely pin-pointed tests so fully checking them
seems very appropriate.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218941 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
I changed various bits of the compilation of atomics recently, and forgot
updating the documentation. This patch just brings it up to date.
Test Plan: no change to the code
Reviewers: jfb
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5590
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218937 91177308-0d34-0410-b5e6-96231b3b80d8
tighter, more strict FileCheck assertions. Some of these I really like
as they show case exactly what instruction sequences come out of these
microscopic functionality tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218936 91177308-0d34-0410-b5e6-96231b3b80d8
baseline for updates from the new vector shuffle lowering.
I've inspected the results here, and I couldn't find any register
allocation decisions where there should be any realistic way to register
allocate things differently. The closest was the imul test case. If you
see something here you'd like register number variables on, just shout
and I'll add them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218935 91177308-0d34-0410-b5e6-96231b3b80d8
need to be updated for the new vector shuffle lowering.
After talking to Adam Nemet, Tim Northover, etc., it seems that testing
MC encodings in the same suite as the basic codegen isn't the right
approach. Instead, we're going to want dedicated MC tests for the
encodings. These encodings are starting to get in my way so I wanted to
cut them out early. The total set of instructions that should have
encoding tests added is:
vpaddd
vsqrtss
vsqrtsd
vmovlhps
vmovhlps
valignq
vbroadcastss
Not too many parts of these tests were even using this. =]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218932 91177308-0d34-0410-b5e6-96231b3b80d8
No functional change intended.
Very similar to the change I made for subvector extract in r218480.
test/CodeGen/X86/avx512-insert-extract.ll covers this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218928 91177308-0d34-0410-b5e6-96231b3b80d8
No functional change.
Very similar to the extract refactoring I did in r218478.
Compared X86.td.expanded before and after.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218927 91177308-0d34-0410-b5e6-96231b3b80d8
Older Book-E cores, such as the PPC 440, support only msync (which has the same
encoding as sync 0), but not any of the other sync forms. Newer Book-E cores,
however, do support sync, and for performance reasons we should allow the use
of the more-general form.
This refactors msync use into its own feature group so that it applies by
default only to older Book-E cores (of the relevant cores, we only have
definitions for the PPC440/450 currently).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218923 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Atomic loads and store of up to the native size (32 bits, or 64 for PPC64)
can be lowered to a simple load or store instruction (as the synchronization
is already handled by AtomicExpand, and the atomicity is guaranteed thanks to
the alignment requirements of atomic accesses). This is exactly what this patch
does. Previously, these were implemented by complex
load-linked/store-conditional loops.. an obvious performance problem.
For example, this patch turns
```
define void @store_i8_unordered(i8* %mem) {
store atomic i8 42, i8* %mem unordered, align 1
ret void
}
```
from
```
_store_i8_unordered: ; @store_i8_unordered
; BB#0:
rlwinm r2, r3, 3, 27, 28
li r4, 42
xori r5, r2, 24
rlwinm r2, r3, 0, 0, 29
li r3, 255
slw r4, r4, r5
slw r3, r3, r5
and r4, r4, r3
LBB4_1: ; =>This Inner Loop Header: Depth=1
lwarx r5, 0, r2
andc r5, r5, r3
or r5, r4, r5
stwcx. r5, 0, r2
bne cr0, LBB4_1
; BB#2:
blr
```
into
```
_store_i8_unordered: ; @store_i8_unordered
; BB#0:
li r2, 42
stb r2, 0(r3)
blr
```
which looks like a pretty clear win to me.
Test Plan:
fixed the tests + new test for indexed accesses + make check-all
Reviewers: jfb, wschmidt, hfinkel
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D5587
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218922 91177308-0d34-0410-b5e6-96231b3b80d8
to be a ManagedStatic in r218163 to not be a global variable written and
read to from within the innards of SpillPlacement.
This will fix a really scary race condition for anyone that has two
copies of LLVM running spill placement concurrently. Yikes!
This will also fix a really significant compile time hit that r218163
caused because the spill placement threshold read is actually in the
*very* hot path of this code. The memory fence on each read was showing
up as huge compile time regressions when spilling is responsible for
most of the compile time. For example, optimizing sanitized code showed
over 50% compile time regressions here. =/
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218921 91177308-0d34-0410-b5e6-96231b3b80d8
Do not eliminate the frame pointer if there is a stackmap or patchpoint in the
function. All stackmap references should be FP relative.
This fixes PR21107.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218920 91177308-0d34-0410-b5e6-96231b3b80d8
This patch defines a new iterator for the imported symbols.
Make a change to COFFDumper to use that iterator to print
out imported symbols and its ordinals.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218915 91177308-0d34-0410-b5e6-96231b3b80d8
This patch addresses the first stage of PR17891 by folding constant
arguments together into a single MDString. Integers are stringified and
a `\0` character is used as a separator.
Part of PR17891.
Note: I've attached my testcases upgrade scripts to the PR. If I've
just broken your out-of-tree testcases, they might help.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218914 91177308-0d34-0410-b5e6-96231b3b80d8
elements as well as integer elements in order to form simpler shuffle
patterns.
This is the primary reason why we were failing to match some of the
2-and-2 floating point shuffles such as PR21140. Even after fixing this
we need to support some extra patterns in the backend in order to match
the resulting X86ISD::UNPCKL nodes into the correct instructions. This
commit should fix PR21140 and includes more comprehensive testing of
insertion patterns in v4 shuffles.
Not all of the added tests are beautiful. For example, we don't have
clever instructions to insert-via-load in the integer domain. There are
also some places where we aren't sufficiently cunning with our use of
movq and movd, but that's future work.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218911 91177308-0d34-0410-b5e6-96231b3b80d8
When unsafe-fp-math is enabled, we can turn sqrt(X) * sqrt(X) into X.
This can happen in the real world when calculating x ** 3/2. This occurs
in test-suite/SingleSource/Benchmarks/BenchmarkGame/n-body.c.
Differential Revision: http://reviews.llvm.org/D5584
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218906 91177308-0d34-0410-b5e6-96231b3b80d8
floating point and integer domains.
Merge the AVX2 test into it and add an extra RUN line. Generate clean
FileCheck statements with my script. Remove the now merged AVX2 tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218903 91177308-0d34-0410-b5e6-96231b3b80d8
Every time we were adding or removing an expression when generating a
coverage mapping we were doing a linear search to try and deduplicate
the list. The indices in the list are important, so we can't just
replace it by a DenseMap entirely, but an auxilliary DenseMap for fast
lookup massively improves the performance issues I was seeing here.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218892 91177308-0d34-0410-b5e6-96231b3b80d8
When the flag is given, the command prints out the COFF import table.
Currently only the import table directory will be printed.
I'm going to make another patch to print out the imported symbols.
The implementation of import directory entry iterator in
COFFObjectFile.cpp was buggy. This patch fixes that too.
http://reviews.llvm.org/D5569
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218891 91177308-0d34-0410-b5e6-96231b3b80d8
When I was preparing r218879 for commit, I removed an early return
that I decided was just noise. It wasn't. This is r218879 no-crash
edition.
This reverts commit r218881, reapplying r218879.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218887 91177308-0d34-0410-b5e6-96231b3b80d8
The Terms vector here represented a polynomial of of all possible
counters, and is used to simplify expressions when generating coverage
mapping. There are a few problems with this:
1. Keeping the vector as a member is wasteful, since we clear it every
time we use it.
2. Most expressions refer to a subset of the counters, so we end up
iterating over a large number of zeros doing nothing a lot of the
time.
This updates the user of the vector to store the terms locally, and
uses a sort and combine approach so that we only operate on counters
that are actually used in a given expression. For small cases this
makes very little difference, but in cases with a very large number of
counted regions this is a significant performance fix.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218879 91177308-0d34-0410-b5e6-96231b3b80d8