and the spill is its kill. However, if the local allocator has determined the
register has not been modified (possible when its value was reloaded), it would
not issue a restore. In that case, mark the last use of the virtual register as
kill.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46111 91177308-0d34-0410-b5e6-96231b3b80d8
It's not safe to use the two value CombineTo variant to combine away a dead load.
e.g.
v1, chain2 = load chain1, loc
v2, chain3 = load chain2, loc
v3 = add v2, c
Now we replace use of v1 with undef, use of chain2 with chain1.
ReplaceAllUsesWith() will iterate through uses of the first load and update operands:
v1, chain2 = load chain1, loc
v2, chain3 = load chain1, loc
v3 = add v2, c
Now the second load is the same as the first load, SelectionDAG cse will ensure
the use of second load is replaced with the first load.
v1, chain2 = load chain1, loc
v3 = add v1, c
Then v1 is replaced with undef and bad things happen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46099 91177308-0d34-0410-b5e6-96231b3b80d8
it should work, but I have no machine to test
it on. Committed because it will at least
cause no harm, and maybe someone can test it
for me!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46098 91177308-0d34-0410-b5e6-96231b3b80d8
make the 'fp return in ST(0)' optimization smart enough to
look through token factor nodes. THis allows us to compile
testcases like CodeGen/X86/fp-stack-retcopy.ll into:
_carg:
subl $12, %esp
call L_foo$stub
fstpl (%esp)
fldl (%esp)
addl $12, %esp
ret
instead of:
_carg:
subl $28, %esp
call L_foo$stub
fstpl 16(%esp)
movsd 16(%esp), %xmm0
movsd %xmm0, 8(%esp)
fldl 8(%esp)
addl $28, %esp
ret
Still not optimal, but much better and this is a trivial patch. Fixing
the rest requires invasive surgery that is is not llvm 2.2 material.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@46054 91177308-0d34-0410-b5e6-96231b3b80d8
- struct_2.ll: Completely unaligned load/store testing
- call_indirect.ll, struct_1.ll: Add test lines to exercise
X-form [$reg($reg)] addressing
At this point, loads and stores should be under control (he says
in an optimistic tone of voice.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45882 91177308-0d34-0410-b5e6-96231b3b80d8
- Cleaned up custom load/store logic, common code is now shared [see note
below], cleaned up address modes
- More test cases: various intrinsics, structure element access (load/store
test), updated target data strings, indirect function calls.
Note: This patch contains a refactoring of the LoadSDNode and StoreSDNode
structures: they now share a common base class, LSBaseSDNode, that
provides an interface to their common functionality. There is some hackery
to access the proper operand depending on the derived class; otherwise,
to do a proper job would require finding and rearranging the SDOperands
sent to StoreSDNode's constructor. The current refactor errs on the
side of being conservatively and backwardly compatible while providing
functionality that reduces redundant code for targets where loads and
stores are custom-lowered.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45851 91177308-0d34-0410-b5e6-96231b3b80d8
Likewise fix up a bunch of other libcalls. While
there I remove NEG_F32 and NEG_F64 since they are
not used anywhere. This fixes 9 Ada ACATS failures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45833 91177308-0d34-0410-b5e6-96231b3b80d8
the code generated is not wonderful. This turns a miscompilation into
a code quality bug (noted in the ppc readme). This fixes PR642, which
is over 2 years old (!). Nate, please review this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45742 91177308-0d34-0410-b5e6-96231b3b80d8
providing a misleading facility. It's used once in the MIPS backend
and hardcoded as "\t.globl\t" everywhere else.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45676 91177308-0d34-0410-b5e6-96231b3b80d8
values, which means doing extra legalization work.
It would be easier to get this kind of thing right if
there was some documentation...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45472 91177308-0d34-0410-b5e6-96231b3b80d8
eliminating the llvm.x86.sse2.loadl.pd intrinsic?), one shuffle optzn
may be done (if shufps is better than pinsw, Evan, please review), and
we already know about LICM of simple instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45407 91177308-0d34-0410-b5e6-96231b3b80d8
if we are just going to store it back anyway. This improves things
like:
double foo();
void bar(double *P) { *P = foo(); }
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45399 91177308-0d34-0410-b5e6-96231b3b80d8
define void @f() {
...
call i32 @g()
...
}
define void @g() {
...
}
The hazards are:
- @f and @g have GC, but they differ GC. Inlining is invalid. This
may never occur.
- @f has no GC, but @g does. g's GC must be propagated to @f.
The other scenarios are safe:
- @f and @g have the same GC.
- @f and @g have no GC.
- @g has no GC.
This patch adds inliner checks for the former two scenarios.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45351 91177308-0d34-0410-b5e6-96231b3b80d8
function with GC.
This will catch the error when the inliner inlines a function with
GC into a caller with no GC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45350 91177308-0d34-0410-b5e6-96231b3b80d8
how to lower them (with no attempt made to be
efficient, since they should only occur for
unoptimized code).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@45108 91177308-0d34-0410-b5e6-96231b3b80d8
SelectionDAG::getConstant, in the same way as vector floating-point
constants. This allows the legalize expansion code for @llvm.ctpop and
friends to be usable with vector types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@44954 91177308-0d34-0410-b5e6-96231b3b80d8
possible before resorting to pextrw and pinsrw.
- Better codegen for v4i32 shuffles masquerading as v8i16 or v16i8 shuffles.
- Improves (i16 extract_vector_element 0) codegen by recognizing
(i32 extract_vector_element 0) does not require a pextrw.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@44836 91177308-0d34-0410-b5e6-96231b3b80d8
methods are new to Function:
bool hasCollector() const;
const std::string &getCollector() const;
void setCollector(const std::string &);
void clearCollector();
The assembly representation is as such:
define void @f() gc "shadow-stack" { ...
The implementation uses an on-the-side table to map Functions to
collector names, such that there is no overhead. A StringPool is
further used to unique collector names, which are extremely
likely to be unique per process.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@44769 91177308-0d34-0410-b5e6-96231b3b80d8
_foo:
movl $12, %eax
andl 4(%esp), %eax
movl _array(%eax), %eax
ret
instead of:
_foo:
movl 4(%esp), %eax
shrl $2, %eax
andl $3, %eax
movl _array(,%eax,4), %eax
ret
As it turns out, this triggers all the time, in a wide variety of
situations, for example, I see diffs like this in various programs:
- movl 8(%eax), %eax
- shll $2, %eax
- andl $1020, %eax
- movl (%esi,%eax), %eax
+ movzbl 8(%eax), %eax
+ movl (%esi,%eax,4), %eax
- shll $2, %edx
- andl $1020, %edx
- movl (%edi,%edx), %edx
+ andl $255, %edx
+ movl (%edi,%edx,4), %edx
Unfortunately, I also see stuff like this, which can be fixed in the
X86 backend:
- andl $85, %ebx
- addl _bit_count(,%ebx,4), %ebp
+ shll $2, %ebx
+ andl $340, %ebx
+ addl _bit_count(%ebx), %ebp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@44656 91177308-0d34-0410-b5e6-96231b3b80d8
optimized. This avoids creating illegal divisions when the combiner is
running after legalize; this fixes PR1815. Also, it produces better
code in the included testcase by avoiding the subtract and multiply
when the division isn't optimized.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@44341 91177308-0d34-0410-b5e6-96231b3b80d8
sometimes emit "zero" and "all one" vectors multiple times,
for example:
_test2:
pcmpeqd %mm0, %mm0
movq %mm0, _M1
pcmpeqd %mm0, %mm0
movq %mm0, _M2
ret
instead of:
_test2:
pcmpeqd %mm0, %mm0
movq %mm0, _M1
movq %mm0, _M2
ret
This patch fixes this by always arranging for zero/one vectors
to be defined as v4i32 or v2i32 (SSE/MMX) instead of letting them be
any random type. This ensures they get trivially CSE'd on the dag.
This fix is also important for LegalizeDAGTypes, as it gets unhappy
when the x86 backend wants BUILD_VECTOR(i64 0) to be legal even when
'i64' isn't legal.
This patch makes the following changes:
1) X86TargetLowering::LowerBUILD_VECTOR now lowers 0/1 vectors into
their canonical types.
2) The now-dead patterns are removed from the SSE/MMX .td files.
3) All the patterns in the .td file that referred to immAllOnesV or
immAllZerosV in the wrong form now use *_bc to match them with a
bitcast wrapped around them.
4) X86DAGToDAGISel::SelectScalarSSELoad is generalized to handle
bitcast'd zero vectors, which simplifies the code actually.
5) getShuffleVectorZeroOrUndef is updated to generate a shuffle that
is legal, instead of generating one that is illegal and expecting
a later legalize pass to clean it up.
6) isZeroShuffle is generalized to handle bitcast of zeros.
7) several other minor tweaks.
This patch is definite goodness, but has the potential to cause random
code quality regressions. Please be on the lookout for these and let
me know if they happen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@44310 91177308-0d34-0410-b5e6-96231b3b80d8
node A gets back into the DAG again because it was hiding in
one of the node maps: make sure that node replacement happens
in those maps too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@44263 91177308-0d34-0410-b5e6-96231b3b80d8
can be eliminated by the allocator is the destination and source targets the
same register. The most common case is when the source and destination registers
are in different class. For example, on x86 mov32to32_ targets GR32_ which
contains a subset of the registers in GR32.
The allocator can do 2 things:
1. Set the preferred allocation for the destination of a copy to that of its source.
2. After allocation is done, change the allocation of a copy destination (if
legal) so the copy can be eliminated.
This eliminates 443 extra moves from 403.gcc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@43662 91177308-0d34-0410-b5e6-96231b3b80d8
transformation. Previously, it's restricted by ensuring the number of load uses
is one. Now the restriction is loosened up by allowing setcc uses to be
"extended" (e.g. setcc x, c, eq -> setcc sext(x), sext(c), eq).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@43465 91177308-0d34-0410-b5e6-96231b3b80d8
b/h/w/k/q inline asm memory modifiers, which are just ignored. This fixes
PR1748 and CodeGen/X86/2007-10-28-inlineasm-q-modifier.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@43430 91177308-0d34-0410-b5e6-96231b3b80d8
FE.
- Explicitly pass in the alignment of the load & store.
- XFAIL 2007-10-23-UnalignedMemcpy.ll because llc has a bug that crashes on
unaligned pointers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@43398 91177308-0d34-0410-b5e6-96231b3b80d8
and the compaison is against a constant value, try eliminate the stride
by moving the compare instruction to another stride and change its
constant operand accordingly. e.g.
loop:
...
v1 = v1 + 3
v2 = v2 + 1
if (v2 < 10) goto loop
=>
loop:
...
v1 = v1 + 3
if (v1 < 30) goto loop
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@43336 91177308-0d34-0410-b5e6-96231b3b80d8
- Avoid attempting stride-reuse in the case that there are users that
aren't addresses. In that case, there will be places where the
multiplications won't be folded away, so it's better to try to
strength-reduce them.
- Several SSE intrinsics have operands that strength-reduction can
treat as addresses. The previous item makes this more visible, as
any non-address use of an IV can inhibit stride-reuse.
- Make ValidStride aware of whether there's likely to be a base
register in the address computation. This prevents it from thinking
that things like stride 9 are valid on x86 when the base register is
already occupied.
Also, XFAIL the 2007-08-10-LEA16Use32.ll test; the new logic to avoid
stride-reuse elimintes the LEA in the loop, so the test is no longer
testing what it was intended to test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@43231 91177308-0d34-0410-b5e6-96231b3b80d8
To do this it is necessary to add a "always inline" argument to the
memcpy node. For completeness I have also added this node to memmove
and memset. I have also added getMem* functions, because the extra
argument makes it cumbersome to use getNode and because I get confused
by it :-)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@43172 91177308-0d34-0410-b5e6-96231b3b80d8
enabled by passing -tailcallopt to llc. The optimization is
performed if the following conditions are satisfied:
* caller/callee are fastcc
* elf/pic is disabled OR
elf/pic enabled + callee is in module + callee has
visibility protected or hidden
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@42870 91177308-0d34-0410-b5e6-96231b3b80d8
basic arithmetic works.
Rename RTLIB long double functions to distinguish
different flavors of long double; the lib functions
have different names, alas.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@42644 91177308-0d34-0410-b5e6-96231b3b80d8
The only generated code difference is that now we call memcpy when
the size of the array is unknown. This matches GCC behavior and is
better since the run time value can be arbitrarily large.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@42433 91177308-0d34-0410-b5e6-96231b3b80d8
both results with a single div or idiv instruction. This uses new X86ISD
nodes for DIV and IDIV which are introduced during the legalize phase
so that the SelectionDAG's CSE can automatically eliminate redundant
computations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@42308 91177308-0d34-0410-b5e6-96231b3b80d8
"_-[NSString(local) isNullOrNil]".eh = 0
.no_dead_strip "_-[NSString(local) isNullOrNil]".eh
The ".eh" should be inside the quotes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@42074 91177308-0d34-0410-b5e6-96231b3b80d8
LLVM now enforces the following prototypes for the write barriers:
<ty>* @llvm.gcread(<ty2>*, <ty>**)
void @llvm.gcwrite(<ty>*, <ty2>*, <ty>**)
And for @llvm.gcroot, the first stack slot is verified to be an alloca or a
bitcast of an alloca.
Fixes test/CodeGen/Generic/GC/lower_gcroot.ll, which violated these.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@42051 91177308-0d34-0410-b5e6-96231b3b80d8