with a vector condition); such selects become VSELECT codegen nodes.
This patch also removes VSETCC codegen nodes, unifying them with SETCC
nodes (codegen was actually often using SETCC for vector SETCC already).
This ensures that various DAG combiner optimizations kick in for vector
comparisons. Passes dragonegg bootstrap with no testsuite regressions
(nightly testsuite as well as "make check-all"). Patch mostly by
Nadav Rotem.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139159 91177308-0d34-0410-b5e6-96231b3b80d8
lower XMM register gets in first. This will allow the SUBREG pattern to
elliminate the first vector insertion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@137310 91177308-0d34-0410-b5e6-96231b3b80d8
hasPredecessorHelper function allows predecessors to be cached to speed up
repeated invocations. This fixes PR10186.
X.isPredecessorOf(Y) now just calls Y.hasPredecessor(X)
Y.hasPredecessor(X) calls Y.hasPredecessorHelper(X, Visited, Worklist) with
empty Visited and Worklist sets (i.e. no caching over invocations).
Y.hasPredecessorHelper(X, Visited, Worklist) caches search state in Visited
and Worklist to speed up repeated calls. The Visited set is searched for X
before going to the worklist to further search the DAG if necessary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@134592 91177308-0d34-0410-b5e6-96231b3b80d8
1. (((x) & 0xFF00) >> 8) | (((x) & 0x00FF) << 8)
=> (bswap x) >> 16
2. ((x&0xff)<<8)|((x&0xff00)>>8)|((x&0xff000000)>>8)|((x&0x00ff0000)<<8))
=> (rotl (bswap x) 16)
This allows us to eliminate most of the def : Pat patterns for ARM rev16
revsh instructions. It catches many more cases for ARM and x86.
rdar://9609108
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133503 91177308-0d34-0410-b5e6-96231b3b80d8
GetDemandBits (which must operate on the vector element type).
Fix the a usage of getZeroExtendInReg which must also be done on scalar types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133052 91177308-0d34-0410-b5e6-96231b3b80d8
converted to add x,x if x is a undef. add undef, undef does not guarantee
that the resulting low order bit is zero.
Fixes <rdar://problem/9453156> and <rdar://problem/9487392>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133022 91177308-0d34-0410-b5e6-96231b3b80d8
The potential DAGCombine which enforces this more generally messes up some other very fragile patterns, so I'm leaving that alone, at least for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@132809 91177308-0d34-0410-b5e6-96231b3b80d8
If there is a store after the load node, then there is a chain, which means
that there is another user. Thus, asking hasOneUser would fail. Instead we
ask hasNUsesOfValue on the 'data' value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@131183 91177308-0d34-0410-b5e6-96231b3b80d8
Returning a new node makes the code try to replace the old node, which
in the included testcase is killed by CSE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@129650 91177308-0d34-0410-b5e6-96231b3b80d8
transformations in target-specific DAG combines without causing DAGCombiner to
delete the same node twice. If you know of a better way to avoid this (see my
next patch for an example), please let me know.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@128758 91177308-0d34-0410-b5e6-96231b3b80d8
1. Inform users of ADDEs with two 0 operands that it never sets carry
2. Fold other ADDs or ADDCs into the ADDE if possible
It would be neat if we could do the same thing for SETCC+ADD eventually, but we can't do that in target independent code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126557 91177308-0d34-0410-b5e6-96231b3b80d8
Limit the folding of any_ext and sext into the load operation to scalars.
Limit the active-bits trunc optimization to scalars.
Document vector trunc and vector sext in LangRef.
Similar to commit 126080 (for enabling zext).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126424 91177308-0d34-0410-b5e6-96231b3b80d8
The DAGCombiner folds the zext into complex load instructions. This patch
prevents this optimization on vectors since none of the supported targets
knows how to perform load+vector_zext in one instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126080 91177308-0d34-0410-b5e6-96231b3b80d8
transformation if we can't legally create a build vector of the correct
type. Check that we can make the transformation first, and add a TODO to
refactor this code with similar cases.
Fixes: PR9223 and rdar://9000350
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125631 91177308-0d34-0410-b5e6-96231b3b80d8
generating i8 shift amounts for things like i1024 types. Add
an assert in getNode to prevent this from occuring in the future,
fix the buggy transformation, revert my previous patch, and
document this gotcha in ISDOpcodes.h
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125465 91177308-0d34-0410-b5e6-96231b3b80d8
The DAGCombiner created illegal BUILD_VECTOR operations.
The patch added a check that either illegal operations are
allowed or that the created operation is legal.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125435 91177308-0d34-0410-b5e6-96231b3b80d8
The bug happens when the DAGCombiner attempts to optimize one of the patterns
of the SUB opcode. It tries to create a zero of type v2i64. This type is legal
on 32bit machines, but the initializer of this vector (i64) is target dependent.
Currently, the initializer attempts to create an i64 zero constant, which fails.
Added a flag to tell the DAGCombiner to create a legal zero, if we require that
the pass would generate legal types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125391 91177308-0d34-0410-b5e6-96231b3b80d8
the load, then it may be legal to transform the load and store to integer
load and store of the same width.
This is done if the target specified the transformation as profitable. e.g.
On arm, this can transform:
vldr.32 s0, []
vstr.32 s0, []
to
ldr r12, []
str r12, []
rdar://8944252
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124708 91177308-0d34-0410-b5e6-96231b3b80d8
This happens all the time when a smul is promoted to a larger type.
On x86-64 we now compile "int test(int x) { return x/10; }" into
movslq %edi, %rax
imulq $1717986919, %rax, %rax
movq %rax, %rcx
shrq $63, %rcx
sarq $34, %rax <- used to be "shrq $32, %rax; sarl $2, %eax"
addl %ecx, %eax
This fires 96 times in gcc.c on x86-64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124559 91177308-0d34-0410-b5e6-96231b3b80d8
This happens e.g. for code like "X - X%10" where we lower the modulo operation
to a series of multiplies and shifts that are then subtracted from X, leading to
this missed optimization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124532 91177308-0d34-0410-b5e6-96231b3b80d8
loads properly. We miscompiled the testcase into:
_test: ## @test
movl $128, (%rdi)
movzbl 1(%rdi), %eax
ret
Now we get a proper:
_test: ## @test
movl $128, (%rdi)
movsbl (%rdi), %eax
movzbl %ah, %eax
ret
This fixes PR8757.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122392 91177308-0d34-0410-b5e6-96231b3b80d8
the shift type was needed one place, the shift count
type another. The transform in 123555 had the same
problem.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122366 91177308-0d34-0410-b5e6-96231b3b80d8