llvm-6502

mirror of https://github.com/c64scene-ar/llvm-6502.git synced 2024-07-22 09:29:31 +00:00

Author	SHA1	Message	Date
Chandler Carruth	ae98867126	[x86] Teach the new v4i32 shuffle lowering some more tricks to recognize vzext patterns and insert-element patterns that for SSE4 have dedicated instructions. With this we can enable the experimental mode in a regression test that happens to cover some of the past set of issues. You can see that the new logic does significantly better here on the floating point cases. A follow-up to this change and the previous ones will hoist the logic into helpers so it can be shared across element type sizes as in this particular case it generalizes cleanly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217136 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-04 09:26:30 +00:00
Chandler Carruth	fa2dfaedf2	[x86] Teach the new vector shuffle lowering about the zero masking abilities of INSERTPS which are really powerful and come up in very important contexts such as forming diagonal matrices, etc. With this I ended up being able to remove the somewhat weird helper I added for INSERTPS because we can collapse the entire state to a no-op mask. Added a bunch of tests for inserting into a zero-ish vector. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217117 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-04 01:13:48 +00:00
Chandler Carruth	699fd1909e	[x86] Teach the new vector shuffle lowering about the simplest of 'insertps' patterns. This replaces two shuffles with a single insertps in very common cases. My next patch will extend this to leverage the zeroing capabilities of insertps which will allow it to be used in a much wider set of cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217100 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-03 22:48:34 +00:00
Chandler Carruth	36cf5d68be	[x86] Add an SSE4.1 mode to this test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217072 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-03 20:39:06 +00:00
Chandler Carruth	87508f1d87	[x86] Make this test check everything for both SSE2 and AVX1 modes, using a common 'all' prefix for the common test output. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217063 91177308-0d34-0410-b5e6-96231b3b80d8	2014-09-03 19:39:10 +00:00
Chandler Carruth	a3805f1c73	[x86] Teach lots of the new vector shuffle lowering to use UNPCK instructions for blend operations at 128 bits. This was a serious hole in our prior blend lowering. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215819 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-16 09:42:15 +00:00
Chandler Carruth	886f0101a7	[x86] Fix the very broken formation of vpunpck instructions in the target-specific shuffle DAG combines. We were recognizing the paired shuffles backwards. This code needs to be replaced anyways as we have the same functionality elsewhere, but I'll do the refactoring in a follow-up, this is the minimal fix to the behavior. In addition to fixing miscompiles with the new vector shuffle lowering, it also causes the canonicalization to kick in much better, selecting the smaller encoding variants in lots of places in the new AVX path. This still isn't quite ideal as we don't need both the shufpd and the punpck instructions, but that'll get fixed in a follow-up patch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215690 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-15 03:54:49 +00:00
Chandler Carruth	15d82b7d33	[x86] Fix a miscompile in the new shuffle lowering found through the new fuzz testing. The function which tested for adjacency did what it said on the tin, but when I called it, I wanted it to do something more thorough: I wanted to know if the pairs of shuffle elements were adjacent and started at 0 mod 2. In one place I had the decency to try to test for this, but in the other it was completely skipped, miscompiling this test case. Fix this by making the helper actually do what I wanted it to do everywhere I called it (and removing the now redundant code in one place). I really dislike the name "canWidenShuffleElements" for this predicate. If anyone can come up with a better name, please let me know. The other name I thought about was "canWidenShuffleMask" but is it really widening the mask to reduce the number of lanes shuffled? I don't know. Naming things is hard. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215089 91177308-0d34-0410-b5e6-96231b3b80d8	2014-08-07 08:11:31 +00:00
Chandler Carruth	63195d7e5a	[x86] Fix another bug hit when bootstrapping with the new shuffle lowering. For maximum irony, I had already discovered this bug, diagnosed it, and left FIXMEs about it in the test cases. =[ I just failed to go back over those until after I had reduced a bootstrap miscompile down to a single TU, stared at the assembly for an hour, and figured out the bug. Again. Oh well. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211955 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-27 20:07:40 +00:00
Chandler Carruth	050d187bc8	[x86] Begin a significant overhaul of how vector lowering is done in the x86 backend. This sketches out a new code path for vector lowering, hidden behind an off-by-default flag while it is under development. The fundamental idea behind the new code path is to aggressively break down the problem space in ways that ease selecting the odd set of instructions available on x86, and carefully avoid scalarizing code even when forced to use older ISAs. Notably, this starts off restricting itself to SSE2 and implements the complete vector shuffle and blend space for 128-bit vectors in SSE2 without scalarizing. The plan is to layer on top of this ISA extensions where we can bail out of the complex SSE2 lowering and opt for a cheaper, specialized instruction (or set of instructions). It also needs to be generalized to AVX and AVX512 vector widths. Currently, this does a decent but not perfect job for SSE2. There are some specific shortcomings that I plan to address: - We need a peephole combine to fold together shuffles where possible. There are cases where a previous shuffle could be modified slightly to arrange for elements to be in the correct position and a later shuffle eliminated. Doing this eagerly added quite a bit of complexity, and so my plan is to combine away these redundancies afterward. - There are a lot more clever ways to use unpck and pack that need to be added. This is essential for real world shuffles as it turns out... Once SSE2 is polished a bit I should be able to get interesting numbers on performance improvements on benchmarks conducive to vectorization. All of this will be off by default until it is functionally equivalent of course. Differential Revision: http://reviews.llvm.org/D4225 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@211888 91177308-0d34-0410-b5e6-96231b3b80d8	2014-06-27 11:23:44 +00:00

10 Commits
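
Note on commit 15d82b7d33: the message describes a predicate that should hold only when each pair of shuffle mask elements is adjacent and starts at 0 mod 2. The following is a minimal Python sketch of that condition, not the LLVM C++ code. The function names `can_widen_shuffle_elements` and `widen_shuffle_mask` are hypothetical, the mask is assumed to be a flat list of non-negative source indices, and undef entries (negative values) are simply rejected here as a simplification.

```python
from typing import List


def can_widen_shuffle_elements(mask: List[int]) -> bool:
    """Return True if every pair of mask entries is adjacent and even-aligned.

    Assumption: entries are non-negative source indices; a negative
    (undef) entry makes the check fail in this simplified sketch.
    """
    if len(mask) % 2 != 0:
        return False
    for i in range(0, len(mask), 2):
        lo, hi = mask[i], mask[i + 1]
        # The pair must start at 0 mod 2 and the second entry must follow the first.
        if lo < 0 or lo % 2 != 0 or hi != lo + 1:
            return False
    return True


def widen_shuffle_mask(mask: List[int]) -> List[int]:
    """Halve the number of lanes by mapping each aligned pair to one wide index."""
    if not can_widen_shuffle_elements(mask):
        raise ValueError("mask pairs are not adjacent and even-aligned")
    return [mask[i] // 2 for i in range(0, len(mask), 2)]


if __name__ == "__main__":
    print(can_widen_shuffle_elements([0, 1, 6, 7]))  # aligned, adjacent pairs
    print(can_widen_shuffle_elements([1, 2, 4, 5]))  # adjacent, but first pair starts odd
    print(widen_shuffle_mask([2, 3, 0, 1]))
```

The mask `[1, 2, 4, 5]` is the kind of case the commit message singles out: its pairs are adjacent, but the first pair does not start at an even index, so an adjacency-only test would wrongly accept it.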