The stack realignment code was fixed to work when there is stack realignment and
a dynamic alloca is present so this shouldn't cause correctness issues anymore.
Note that this also enables generation of AVX instructions for memset
under the assumptions:
- Unaligned loads/stores are always fast on CPUs supporting AVX
- AVX is not slower than SSE
We may need some tweaked heuristics if one of those assumptions turns out not to
be true.
Effectively reverts r58317. Part of PR2962.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@167967 91177308-0d34-0410-b5e6-96231b3b80d8
for pre-2.9 bitcode files. We keep x86 unaligned loads, movnt, crc32, and the
target indep prefetch change.
As usual, updating the testsuite is a PITA.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@133337 91177308-0d34-0410-b5e6-96231b3b80d8
The theory is it's still faster than a pair of movq / a quad of movl. This
will probably hurt older chips like P4 but should run faster on current
and future Intel processors. rdar://8817010
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@122955 91177308-0d34-0410-b5e6-96231b3b80d8
memset; we may need it to decide between MOVAPS and MOVUPS
later. Adjust a test that was looking for wrong code.
PR 3866 / 8675131.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@119605 91177308-0d34-0410-b5e6-96231b3b80d8
1. Makes it possible to lower with floating point loads and stores.
2. Avoid unaligned loads / stores unless it's fast.
3. Fix some memcpy lowering logic bug related to when to optimize a
load from constant string into a constant.
4. Adjust x86 memcpy lowering threshold to make it more sane.
5. Fix x86 target hook so it uses vector and floating point memory
ops more effectively.
rdar://7774704
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@100090 91177308-0d34-0410-b5e6-96231b3b80d8
1. x86-64 byval alignment should be max of 8 and alignment of type. Previously the code was not doing what the commit message was saying.
2. Do not use byte repeat move and store operations. These are slow.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@55139 91177308-0d34-0410-b5e6-96231b3b80d8
memcpy lowering code; this ensures that the size node has the desired
result type. This fixes a regression from r49572 with @llvm.memcpy.i64
on x86-32.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@49761 91177308-0d34-0410-b5e6-96231b3b80d8