dependencies between return values and/or arguments. Also make the handling of
arguments and return values the same.
The pass now looks properly inside returned structs, but only at the first
level (ie, not inside nested structs).
Also add a testcase for testing various variations of (multiple) dead rerturn
values.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52459 91177308-0d34-0410-b5e6-96231b3b80d8
time. Sorry for the trouble!
This time, also add a testcase, which I should have done in the first place...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52455 91177308-0d34-0410-b5e6-96231b3b80d8
individually.
Also learn IPConstProp how returning first class aggregates work, in addition
to old style multiple return instructions.
Modify the return-constants testscase to confirm this behaviour.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52396 91177308-0d34-0410-b5e6-96231b3b80d8
when changing the stride of a comparison so that it's slightly
more precise, by having it scan the instruction list to determine
if there is a use of the condition after the point where the
condition will be inserted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52371 91177308-0d34-0410-b5e6-96231b3b80d8
pointer derived from a local allocation, if the local allocation
never escapes, the pointers can't alias. This implements PR2436
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52301 91177308-0d34-0410-b5e6-96231b3b80d8
take into account the instrucion pointed by InsertPt. Thanks to it,
returning the new value of InsertPt to the InsertBinop() caller can be
avoided. The bug was, actually, in visitAddRecExpr() method which wasn't
correctly handling changes of InsertPt. There shouldn't be any
performance regression, as -gvn pass (run after -indvars) removes any
redundant binops.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52291 91177308-0d34-0410-b5e6-96231b3b80d8
wrong for volatile loads and stores. In fact this
is almost all of them! There are three types of
problems: (1) it is wrong to change the width of
a volatile memory access. These may be used to
do memory mapped i/o, in which case a load can have
an effect even if the result is not used. Consider
loading an i32 but only using the lower 8 bits. It
is wrong to change this into a load of an i8, because
you are no longer tickling the other three bytes. It
is also unwise to make a load/store wider. For
example, changing an i16 load into an i32 load is
wrong no matter how aligned things are, since the
fact of loading an additional 2 bytes can have
i/o side-effects. (2) it is wrong to change the
number of volatile load/stores: they may be counted
by the hardware. (3) it is wrong to change a volatile
load/store that requires one memory access into one
that requires several. For example on x86-32, you
can store a double in one processor operation, but to
store an i64 requires two (two i32 stores). In a
multi-threaded program you may want to bitcast an i64
to a double and store as a double because that will
occur atomically, and be indivisible to other threads.
So it would be wrong to convert the store-of-double
into a store of an i64, because this will become two
i32 stores - no longer atomic. My policy here is
to say that the number of processor operations for
an illegal operation is undefined. So it is alright
to change a store of an i64 (requires at least two
stores; but could be validly lowered to memcpy for
example) into a store of double (one processor op).
In short, if the new store is legal and has the same
size then I say that the transform is ok. It would
also be possible to say that transforms are always
ok if before they were illegal, whether after they
are illegal or not, but that's more awkward to do
and I doubt it buys us anything much.
However this exposed an interesting thing - on x86-32
a store of i64 is considered legal! That is because
operations are marked legal by default, regardless of
whether the type is legal or not. In some ways this
is clever: before type legalization this means that
operations on illegal types are considered legal;
after type legalization there are no illegal types
so now operations are only legal if they really are.
But I consider this to be too cunning for mere mortals.
Better to do things explicitly by testing AfterLegalize.
So I have changed things so that operations with illegal
types are considered illegal - indeed they can never
map to a machine operation. However this means that
the DAG combiner is more conservative because before
it was "accidentally" performing transforms where the
type was illegal because the operation was nonetheless
marked legal. So in a few such places I added a check
on AfterLegalize, which I suppose was actually just
forgotten before. This causes the DAG combiner to do
slightly more than it used to, which resulted in the X86
backend blowing up because it got a slightly surprising
node it wasn't expecting, so I tweaked it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52254 91177308-0d34-0410-b5e6-96231b3b80d8
with code that was expecting different bit widths for different values.
Make getTruncateOrZeroExtend a method on ScalarEvolution, and use it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52248 91177308-0d34-0410-b5e6-96231b3b80d8
variable expansions involving the $ character.
This fixes 4 tests that were not running properly before.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52183 91177308-0d34-0410-b5e6-96231b3b80d8
cases quoting of <{ didn't work out, so I changed the grep to check for }>
instead.
This fixes 7 testcases that were not properly running before.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52182 91177308-0d34-0410-b5e6-96231b3b80d8
Also, use > %t instead of -o %t for output in one test since that also works
when %t already exists.
This fixes 6 testcases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52178 91177308-0d34-0410-b5e6-96231b3b80d8
declarations. These are the fixes that I was pretty confident about, there are
still a lot of other llvm-gcc warnings of which I'm not sure if they can be
safely ignored or fixed, without breaking the test case.
This fixes 11 testcases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52176 91177308-0d34-0410-b5e6-96231b3b80d8
don't fail when (expected) error output is produced. This fixes 17 tests.
While I was there, I also made all RUN lines of the form "not llvm-as..." a bit
more consistent, they now all redirect stderr and stdout to /dev/null and use
input redirect to read their input.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52174 91177308-0d34-0410-b5e6-96231b3b80d8
tests. This breaks 80 tests in the tree.
The interesting part here is that this no longer ignores syntax errors
in RUN command lines. Some tests have not been working all the time because of
this.
The tricky part is that it now also views any stderr output as an error. This
can be suppressed in tcl 8.5, but let's not add this dependency. Instead, all
testcases should be changed to redirect stderr if they expect stderr output.
This holds in particular for lines like:
; RUN: not llvm-as < %s
where an error is expected (but I think I can solve this by modifying the not
script). Also, compilations resulting in warnings will now also fail (so
the warnings should be fixed, disabled or redirected...).
I'll continue with fixing the testcases that are broken now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52172 91177308-0d34-0410-b5e6-96231b3b80d8
types on functions, with adjustments so that it accepts both
new-style aggregate returns and old-style MRV returns, including those
with only a single member.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@52157 91177308-0d34-0410-b5e6-96231b3b80d8
work and how to replace them into individual values. Also, when trying to
replace an aggregrate that is used by load or store with a single (large)
integer, don't crash (but don't replace the aggregrate either).
Also adds a testcase for both structs and arrays.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51997 91177308-0d34-0410-b5e6-96231b3b80d8
are the same as in unpacked structs, only field
positions differ. This only matters for structs
containing x86 long double or an apint; it may
cause backwards compatibility problems if someone
has bitcode containing a packed struct with a
field of one of those types.
The issue is that only 10 bytes are needed to
hold an x86 long double: the store size is 10
bytes, but the ABI size is 12 or 16 bytes (linux/
darwin) which comes from rounding the store size
up by the alignment. Because it seemed silly not
to pack an x86 long double into 10 bytes in a
packed struct, this is what was done. I now
think this was a mistake. Reserving the ABI size
for an x86 long double field even in a packed
struct makes things more uniform: the ABI size is
now always used when reserving space for a type.
This means that developers are less likely to
make mistakes. It also makes life easier for the
CBE which otherwise could not represent all LLVM
packed structs (PR2402).
Front-end people might need to adjust the way
they create LLVM structs - see following change
to llvm-gcc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51928 91177308-0d34-0410-b5e6-96231b3b80d8
issue is operand promotion for setcc/select... but looks like the fundamental
stuff is implemented for CellSPU.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51884 91177308-0d34-0410-b5e6-96231b3b80d8
and insertvalue and extractvalue instructions.
First-class array values are not trivial because C doesn't
support them. The approach I took here is to wrap all arrays
in structs. Feedback is welcome.
The 2007-01-15-NamedArrayType.ll test needed to be modified
because it has a "not grep" for a string that now exists,
because array types now have associated struct types, and
those struct types have names.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51881 91177308-0d34-0410-b5e6-96231b3b80d8
in DAGISelEmitter output. This bug was recently uncovered by the
addition of patterns for CALL32m and CALL64m, which are nodes
that now have both MemOperands and variadic_ops.
This bug was especially visible with PIC in various configurations,
because the new patterns are matching the indirect call code used
in many PIC configurations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51877 91177308-0d34-0410-b5e6-96231b3b80d8
is longer than the second one) should stop after finding one. Added break
instruction guarantees it. It also changes difference between offsets to
absolute value of this difference in the condition.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51875 91177308-0d34-0410-b5e6-96231b3b80d8
the conditions for performing the transform when only the
function declaration is available: no longer allow turning
i32 into i64 for example. Only allow changing between
pointer types, and between pointer types and integers of
the same size. For return values ptr -> intptr was already
allowed; I added ptr -> ptr and intptr -> ptr while there.
As shown by a recent objc testcase, changing the way
parameters/return values are passed can be fatal when calling
code written in assembler that directly manipulates call
arguments and return values unless the transform has no
impact on the way they are passed at the codegen level.
While it is possible to imagine an ABI that treats integers
of pointer size differently to pointers, I don't think LLVM
supports any so the transform should now be safe while still
being useful.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51834 91177308-0d34-0410-b5e6-96231b3b80d8
we did not truncate the value down to i1 with (x&1). This caused a problem
when the computation of x was nontrivial, for example, "add i1 1, 1" would
return 2 instead of 0.
This makes the testcase compile into:
...
llvm_cbe_t = (((llvm_cbe_r == 0u) + (llvm_cbe_r == 0u))&1);
llvm_cbe_u = (((unsigned int )(bool )llvm_cbe_t));
...
instead of:
...
llvm_cbe_t = ((llvm_cbe_r == 0u) + (llvm_cbe_r == 0u));
llvm_cbe_u = (((unsigned int )(bool )llvm_cbe_t));
...
This fixes a miscompilation of mediabench/adpcm/rawdaudio/rawdaudio and
403.gcc with the CBE, regressions from LLVM 2.2. Tanya, please pull
this into the release branch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51813 91177308-0d34-0410-b5e6-96231b3b80d8
insertvalue and extractvalue to use constant indices instead of
Value* indices. And begin updating LangRef.html.
There's definately more to come here, but I'm checking this
basic support in now to make it available to people who are
interested.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51806 91177308-0d34-0410-b5e6-96231b3b80d8
cases due to an isel deficiency already noted in
lib/Target/X86/README.txt, but they can be matched in this fold-call.ll
testcase, for example.
This is interesting mainly because it exposes a tricky tblgen bug;
tblgen was incorrectly computing the starting index for variable_ops
in the case of a complex pattern.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51706 91177308-0d34-0410-b5e6-96231b3b80d8
the one case that ADCE catches that normal DCE doesn't: non-induction variable
loop computations.
This implementation handles this problem without using postdominators.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51668 91177308-0d34-0410-b5e6-96231b3b80d8
sometimes a "mov %ebp, %esp" in the epilogue.
Force these tests that rely on counting 'mov' to use i686-apple-darwin8.8.0
where they were written.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51568 91177308-0d34-0410-b5e6-96231b3b80d8
Analysis/ConstantFolding to fold ConstantExpr's, then make instcombine use it
to try to use targetdata to fold constant expressions on void instructions.
Also extend the icmp(inttoptr, inttoptr) folding to handle the case where
int size != ptr size.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51559 91177308-0d34-0410-b5e6-96231b3b80d8
The SimplifyCFG pass looks at basic blocks that contain only phi nodes,
followed by an unconditional branch. In a lot of cases, such a block (BB) can
be merged into their successor (Succ).
This merging is performed by TryToSimplifyUncondBranchFromEmptyBlock. It does
this by taking all phi nodes in the succesor block Succ and expanding them to
include the predecessors of BB. Furthermore, any phi nodes in BB are moved to
Succ and expanded to include the predecessors of Succ as well.
Before attempting this merge, CanPropagatePredecessorsForPHIs checks to see if
all phi nodes can be properly merged. All functional changes are made to
this function, only comments were updated in
TryToSimplifyUncondBranchFromEmptyBlock.
In the original code, CanPropagatePredecessorsForPHIs looks quite convoluted
and more like stack of checks added to handle different kinds of situations
than a comprehensive check. In particular the first check in the function did
some value checking for the case that BB and Succ have a common predecessor,
while the last check in the function simply rejected all cases where BB and
Succ have a common predecessor. The first check was still useful in the case
that BB did not contain any phi nodes at all, though, so it was not completely
useless.
Now, CanPropagatePredecessorsForPHIs is restructured to to look a lot more
similar to the code that actually performs the merge. Both functions now look
at the same phi nodes in about the same order. Any conflicts (phi nodes with
different values for the same source) that could arise from merging or moving
phi nodes are detected. If no conflicts are found, the merge can happen.
Apart from only restructuring the checks, two main changes in functionality
happened.
Firstly, the old code rejected blocks with common predecessors in most cases.
The new code performs some extra checks so common predecessors can be handled
in a lot of cases. Wherever common predecessors still pose problems, the
blocks are left untouched.
Secondly, the old code rejected the merge when values (phi nodes) from BB were
used in any other place than Succ. However, it does not seem that there is any
situation that would require this check. Even more, this can be proven.
Consider that BB is a block containing of a single phi node "%a" and a branch
to Succ. Now, since the definition of %a will dominate all of its uses, BB
will dominate all blocks that use %a. Furthermore, since the branch from BB to
Succ is unconditional, Succ will also dominate all uses of %a.
Now, assume that one predecessor of Succ is not dominated by BB (and thus not
dominated by Succ). Since at least one use of %a (but in reality all of them)
is reachable from Succ, you could end up at a use of %a without passing
through it's definition in BB (by coming from X through Succ). This is a
contradiction, meaning that our original assumption is wrong. Thus, all
predecessors of Succ must also be dominated by BB (and thus also by Succ).
This means that moving the phi node %a from BB to Succ does not pose any
problems when the two blocks are merged, and any use checks are not needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51478 91177308-0d34-0410-b5e6-96231b3b80d8
and/or to handle more cases (such as this add-sitofp.ll testcase), and
port it to selectiondag's ComputeNumSignBits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51469 91177308-0d34-0410-b5e6-96231b3b80d8
and bitcode support for the extractvalue and insertvalue
instructions and constant expressions.
Note that this does not yet include CodeGen support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51468 91177308-0d34-0410-b5e6-96231b3b80d8
get inline asm working as well as it did previously with the CBE
with the new MRV support for inline asm.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51420 91177308-0d34-0410-b5e6-96231b3b80d8
BB1:
vr1025 = copy vr1024
..
BB2:
vr1024 = op
= op vr1025
<loop eventually branch back to BB1>
Even though vr1025 is copied from vr1024, it's not safe to coalesced them since live range of vr1025 intersects the def of vr1024. This happens when vr1025 is assigned the value of the previous iteration of vr1024 in the loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51394 91177308-0d34-0410-b5e6-96231b3b80d8
If local spiller optimization turns some instruction into an identity copy, it will be removed. If the output register happens to be dead (and source is obviously killed), transfer the kill / dead information to last use / def in the same MBB.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51306 91177308-0d34-0410-b5e6-96231b3b80d8
to accurately represent the integer. This triggers 9 times in 471.omnetpp,
though 8 of those seem to be inlined from the same place.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51271 91177308-0d34-0410-b5e6-96231b3b80d8
type and the other operand is a constant into integer comparisons.
This happens surprisingly frequently (e.g. 10 times in 471.omnetpp),
which are things like this:
%tmp8283 = sitofp i32 %tmp82 to double
%tmp1013 = fcmp ult double %tmp8283, 0.0
Clearly comparing tmp82 against i32 0 is cheaper here.
this also triggers 8 times in gobmk, including this one:
%tmp375376 = sitofp i32 %tmp375 to double
%tmp377 = fcmp ogt double %tmp375376, 8.150000e+01
which is comparing an integer against 81.5 :).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51268 91177308-0d34-0410-b5e6-96231b3b80d8
intersecting bits. This triggers all over the place, for example in lencode,
with adds of stuff like:
%tmp580 = mul i32 %tmp579, 2
%tmp582 = and i32 %b8, 1
and
%tmp28 = shl i32 %abs.i, 1
%sign.0 = select i1 %tmp23, i32 1, i32 0
and
%tmp344 = shl i32 %tmp343, 2
%tmp346 = and i32 %tmp96, 3
etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51263 91177308-0d34-0410-b5e6-96231b3b80d8
whether or not -funit-at-a-time is used (C++ uses
it, C doesn't) - it was working before only when
not doing unit-at-a-time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@51258 91177308-0d34-0410-b5e6-96231b3b80d8