llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2025-01-28 06:32:09 +00:00


Chandler Carruth 4ea3097d08 [x86] Teach the vector shuffle lowering to make a more nuanced decision
between splitting a vector into 128-bit lanes and recombining them vs.
decomposing things into single-input shuffles and a final blend.

This handles a large number of cases in AVX1 where the cross-lane
shuffles would be much more expensive to represent even though we end up
with a fast blend at the root. Instead, we can do a better job of
shuffling in a single lane and then inserting it into the other lanes.
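
To make the "single-input shuffles and a final blend" idea concrete, here is a minimal standalone sketch. It is not the code from this commit and does not use LLVM's API: the names `DecomposedShuffle`, `decomposeShuffle`, and `crossesLanes` are hypothetical, and the mask convention (indices 0..N-1 read the first input, N..2N-1 read the second, -1 is undefined) is assumed for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type for illustration; not part of LLVM.
struct DecomposedShuffle {
  std::vector<int> V1Mask;   // single-input shuffle of the first operand (-1 = don't care)
  std::vector<int> V2Mask;   // single-input shuffle of the second operand (-1 = don't care)
  std::vector<bool> TakeV2;  // blend control: true picks the shuffled second operand
};

// Split a two-input shuffle mask into two single-input masks plus a blend.
// Assumed convention: 0..N-1 selects from input 1, N..2N-1 from input 2,
// and -1 marks an undefined element.
DecomposedShuffle decomposeShuffle(const std::vector<int> &Mask) {
  const int N = static_cast<int>(Mask.size());
  DecomposedShuffle R{std::vector<int>(N, -1), std::vector<int>(N, -1),
                      std::vector<bool>(N, false)};
  for (int i = 0; i < N; ++i) {
    const int M = Mask[i];
    assert(M >= -1 && M < 2 * N && "mask index out of range");
    if (M < 0)
      continue;               // undefined element: leave both sides as don't care
    if (M < N) {
      R.V1Mask[i] = M;        // element comes from the first input
    } else {
      R.V2Mask[i] = M - N;    // element comes from the second input
      R.TakeV2[i] = true;
    }
  }
  return R;
}

// For a single-input mask, report whether any defined element is read from a
// different lane than the one it is written to. EltsPerLane would be 4 for
// 32-bit elements in 128-bit lanes.
bool crossesLanes(const std::vector<int> &Mask, int EltsPerLane) {
  for (std::size_t i = 0; i < Mask.size(); ++i)
    if (Mask[i] >= 0 &&
        Mask[i] / EltsPerLane != static_cast<int>(i) / EltsPerLane)
      return true;
  return false;
}
```

For an 8-element mask such as `{0, 9, 2, 11, 4, 13, 6, 15}`, both resulting single-input masks stay within their 128-bit lanes, which is the kind of case where a cheap in-lane shuffle on each side followed by a blend is attractive. A real heuristic would also have to weigh instruction costs, which this sketch does not attempt.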

This fixes the remaining bits of Halide's regression captured in PR21281
for AVX1. However, the bug persists in AVX2 because I've made this
change reasonably conservative. The cases where it makes sense in AVX2
to split into 128-bit lanes are much rarer because we can often do
full permutations across all elements of the 256-bit vector. However,
the particular test case in PR21281 is an example of one of the rare
cases where it is *always* better to work in a single 128-bit lane. I'm
going to try to teach the logic to detect and form the good code even in
AVX2 next, but it will need to use a separate heuristic.

Finally, there is one pesky regression here where we previously would
craftily use vpermilps in AVX1 to shuffle both high and low halves at
the same time. We no longer pull that off, and not for any really good
reason. Ultimately, I think this is just another missing nuance to the
selection heuristic that I'll try to add in afterward, but this change
already seems strictly worth doing considering the magnitude of the
improvements in common matrix math shuffle patterns.

As always, please let me know if this causes a surprising regression for
you.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@221861 91177308-0d34-0410-b5e6-96231b3b80d8

2014-11-13 04:06:10 +00:00

autoconf

Add a check for misbehaving -Wcomment from gcc-4.7 and add

2014-11-05 00:35:15 +00:00

bindings

[OCaml] Fix mismatched CAMLparam/CAMLreturn.

2014-11-03 11:47:14 +00:00

cmake

Pass PRIVATE to target_link_libraries if using shared libraries.

2014-11-07 15:33:56 +00:00

docs

configure.ac lives in autoconf/, not autotools/

2014-11-10 22:36:04 +00:00

examples

[CMake] llvm/examples: Update libdeps for unoptimized builds.

2014-10-31 15:27:16 +00:00

include

llvm-readobj: Print out address table when dumping COFF delay-import table

2014-11-13 03:22:54 +00:00

lib

[x86] Teach the vector shuffle lowering to make a more nuanced decision

2014-11-13 04:06:10 +00:00

projects

…

test

[x86] Teach the vector shuffle lowering to make a more nuanced decision

2014-11-13 04:06:10 +00:00

tools

llvm-readobj: Print out address table when dumping COFF delay-import table

2014-11-13 03:22:54 +00:00

unittests

Drop a few unneeded ctor calls (missed code review comment).

2014-11-13 00:36:34 +00:00

utils

Make TreePattern::error use Twine

2014-11-11 23:48:11 +00:00

.arcconfig

…

.clang-format

…

.clang-tidy

Enable display of compiler diagnostics in clang-tidy by default.

2014-10-29 17:29:38 +00:00

.gitignore

Initial version of Go bindings.

2014-10-16 22:48:02 +00:00

CMakeLists.txt

Pass PRIVATE to target_link_libraries if using shared libraries.

2014-11-07 15:33:56 +00:00

CODE_OWNERS.TXT

Add Tom Stellard's role as 3.5 release manager.

2014-09-12 08:07:31 +00:00

configure

Add a check for misbehaving -Wcomment from gcc-4.7 and add

2014-11-05 00:35:15 +00:00

CREDITS.TXT

Rise from the dead and update personal info

2014-08-25 17:51:04 +00:00

LICENSE.TXT

…

llvm.spec.in

…

LLVMBuild.txt

…

Makefile

…

Makefile.common

…

Makefile.config.in

Add a check for misbehaving -Wcomment from gcc-4.7 and add

2014-11-05 00:35:15 +00:00

Makefile.rules

Add a check for misbehaving -Wcomment from gcc-4.7 and add

2014-11-05 00:35:15 +00:00

README.txt

…

README.txt

Low Level Virtual Machine (LLVM)
================================

This directory and its subdirectories contain source code for the Low Level
Virtual Machine, a toolkit for the construction of highly optimized compilers,
optimizers, and runtime environments.

LLVM is open source software. You may freely distribute it under the terms of
the license agreement found in LICENSE.TXT.

Please see the documentation provided in docs/ for further
assistance with LLVM, and in particular docs/GettingStarted.rst for getting
started with LLVM and docs/README.txt for an overview of LLVM's
documentation setup.

If you're writing a package for LLVM, see docs/Packaging.rst for our
suggestions.

Languages

C++ 48.7%

LLVM 38.5%

Assembly 10.2%

C 0.9%

Python 0.4%

Other 1.2%