testcases accordingly. Some are currently xfailed and will be filed
as bugs to be fixed or understood.
Performance results:
roughly neutral on SPEC
some micro benchmarks in the llvm suite are up between 100 and 150%, only
a pair of regressions that are due to be investigated
john-the-ripper saw:
10% improvement in traditional DES
8% improvement in BSDI DES
59% improvement in FreeBSD MD5
67% improvement in OpenBSD Blowfish
14% improvement in LM DES
Small compile time impact.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127208 91177308-0d34-0410-b5e6-96231b3b80d8
bitcasts, which are really no-ops here. This fixes slowdowns on
MultiSource/Applications/aha and others.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@127031 91177308-0d34-0410-b5e6-96231b3b80d8
missing patterns for them.
Add a SIMD test subdirectory to hold tests for SIMD instruction
selection correctness and quality.
'
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126845 91177308-0d34-0410-b5e6-96231b3b80d8
It improves Win64's prologue/epilogue but it would not affect ia32 and amd64 (lack of nonvolatile XMMs).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126568 91177308-0d34-0410-b5e6-96231b3b80d8
1. Inform users of ADDEs with two 0 operands that it never sets carry
2. Fold other ADDs or ADDCs into the ADDE if possible
It would be neat if we could do the same thing for SETCC+ADD eventually, but we can't do that in target independent code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126557 91177308-0d34-0410-b5e6-96231b3b80d8
Limit the folding of any_ext and sext into the load operation to scalars.
Limit the active-bits trunc optimization to scalars.
Document vector trunc and vector sext in LangRef.
Similar to commit 126080 (for enabling zext).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126424 91177308-0d34-0410-b5e6-96231b3b80d8
registers at phis. This enables us to eliminate a lot of pointless zexts during
the DAGCombine phase. This fixes <rdar://problem/8760114>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126380 91177308-0d34-0410-b5e6-96231b3b80d8
at phis. This enables us to eliminate a lot of pointless zexts during the DAGCombine
phase. This fixes <rdar://problem/8760114>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126170 91177308-0d34-0410-b5e6-96231b3b80d8
In other words, do not keep track of argument's location. The debugger (gdb) is not prepared to see line table entries for arguments. For the debugger, "second" line table entry marks beginning of function body.
This requires some coordination with debugger to get this working.
- The debugger needs to be aware of prolog_end attribute attached with line table entries.
- The compiler needs to accurately mark prolog_end in line table entries (at -O0 and at -O1+)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126155 91177308-0d34-0410-b5e6-96231b3b80d8
"dllimport" function must not be GlobalVariable, but Function. It is enough to check with GlobalValue.
test/CodeGen/X86/dll-linkage.ll is updated to check llc -O0.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126110 91177308-0d34-0410-b5e6-96231b3b80d8
of a constant had a minor typo introduced when copying it from the book, which
caused it to favor negative approximations over positive approximations in many
cases. Positive approximations require fewer operations beyond the multiplication.
In the case of division by 3, we still generate code that is a single instruction
larger than GCC's code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126097 91177308-0d34-0410-b5e6-96231b3b80d8
test for that. With this change, test/CodeGen/X86/codegen-dce.ll no longer finds
any instructions to DCE, so delete the test.
Also renamed J and JP to I and IP in RecursivelyDeleteDeadPHINode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126088 91177308-0d34-0410-b5e6-96231b3b80d8
The DAGCombiner folds the zext into complex load instructions. This patch
prevents this optimization on vectors since none of the supported targets
knows how to perform load+vector_zext in one instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@126080 91177308-0d34-0410-b5e6-96231b3b80d8
No one uses *-mingw64. mingw-w64 is represented as {i686|x86_64}-w64-mingw32. In llvm side, i686 and x64 can be treated as similar way.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125747 91177308-0d34-0410-b5e6-96231b3b80d8
transformation if we can't legally create a build vector of the correct
type. Check that we can make the transformation first, and add a TODO to
refactor this code with similar cases.
Fixes: PR9223 and rdar://9000350
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125631 91177308-0d34-0410-b5e6-96231b3b80d8
Machine instruction range consisting of only DBG_VALUE MIs only contributes consecutive labels in assembly output, which is harmless, and empty scope entry in DebugInfo, which confuses debugger tools.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125577 91177308-0d34-0410-b5e6-96231b3b80d8
have their low bits set to zero. This allows us to optimize
out explicit stack alignment code like in stack-align.ll:test4 when
it is redundant.
Doing this causes the code generator to start turning FI+cst into
FI|cst all over the place, which is general goodness (that is the
canonical form) except that various pieces of the code generator
don't handle OR aggressively. Fix this by introducing a new
SelectionDAG::isBaseWithConstantOffset predicate, and using it
in places that are looking for ADD(X,CST). The ARM backend in
particular was missing a lot of addressing mode folding opportunities
around OR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125470 91177308-0d34-0410-b5e6-96231b3b80d8
the shift amounts are in a suitably wide type so that
we don't generate out of range constant shift amounts.
This fixes PR9028.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@125458 91177308-0d34-0410-b5e6-96231b3b80d8
Reversing the operands allows us to fold, but doesn't force us to. Also, at
this point the DAG is still being optimized, so the check for hasOneUse is not
very precise.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124773 91177308-0d34-0410-b5e6-96231b3b80d8
This happens all the time when a smul is promoted to a larger type.
On x86-64 we now compile "int test(int x) { return x/10; }" into
movslq %edi, %rax
imulq $1717986919, %rax, %rax
movq %rax, %rcx
shrq $63, %rcx
sarq $34, %rax <- used to be "shrq $32, %rax; sarl $2, %eax"
addl %ecx, %eax
This fires 96 times in gcc.c on x86-64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@124559 91177308-0d34-0410-b5e6-96231b3b80d8
to add/sub by doing the normal operation and then checking for overflow
afterwards. This generally relies on the DAG handling the later invalid
operations as well.
Fixes the 64-bit part of rdar://8622122 and rdar://8774702.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123908 91177308-0d34-0410-b5e6-96231b3b80d8
This shaves off 4 popcounts from the hacked 186.crafty source.
This is enabled even when a native popcount instruction is available. The
combined code is one operation longer but it should be faster nevertheless.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123621 91177308-0d34-0410-b5e6-96231b3b80d8
into and/shift would cause nodes to move around and a dangling pointer
to happen. The code tried to avoid this with a HandleSDNode, but
got the details wrong.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@123578 91177308-0d34-0410-b5e6-96231b3b80d8