arrays with variable indices into a comparison of the index
with a constant. The most common occurrence of this that
I see by far is stuff like:
if ("foobar"[i] == '\0') ...
which we compile into: if (i == 6), saving a load and
materialization of the global address. This also exposes
loop trip count information to later passes in many cases.
This triggers hundreds of times in xalancbmk, which is where I first
noticed it, but it also triggers in many other apps. Here are a few
interesting ones from various apps:
@must_be_connected_without = internal constant [8 x i8*] [i8* getelementptr inbounds ([3 x i8]* @.str64320, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str27283, i64 0, i64 0), i8* getelementptr inbounds ([4 x i8]* @.str71327, i64 0, i64 0), i8* getelementptr inbounds ([4 x i8]* @.str72328, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str18274, i64 0, i64 0), i8* getelementptr inbounds ([6 x i8]* @.str11267, i64 0, i64 0), i8* getelementptr inbounds ([3 x i8]* @.str32288, i64 0, i64 0), i8* null], align 32 ; <[8 x i8*]*> [#uses=2]
%scevgep.i = getelementptr [8 x i8*]* @must_be_connected_without, i64 0, i64 %indvar.i ; <i8**> [#uses=1]
%17 = load ...
%18 = icmp eq i8* %17, null ; <i1> [#uses=1]
-> icmp eq i64 %indvar.i, 7
@yytable1095 = internal constant [84 x i8] c"\12\01(\05\06\07\08\09\0A\0B\0C\0D\0E1\0F\10\11266\1D: \10\11,-,0\03'\10\11B6\04\17&\18\1945\05\06\07\08\09\0A\0B\0C\0D\0E\1E\0F\10\11*\1A\1B\1C$3+>#%;<IJ=ADFEGH9KL\00\00\00C", align 32 ; <[84 x i8]*> [#uses=2]
%57 = getelementptr inbounds [84 x i8]* @yytable1095, i64 0, i64 %56 ; <i8*> [#uses=1]
%mode.0.in = getelementptr inbounds [9 x i32]* @mb_mode_table, i64 0, i64 %.pn ; <i32*> [#uses=1]
load ...
%64 = icmp eq i8 %58, 4 ; <i1> [#uses=1]
-> icmp eq i64 %.pn, 35 ; <i1> [#uses=0]
@gsm_DLB = internal constant [4 x i16] [i16 6554, i16 16384, i16 26214, i16 32767]
%scevgep.i = getelementptr [4 x i16]* @gsm_DLB, i64 0, i64 %indvar.i ; <i16*> [#uses=1]
%425 = load %scevgep.i
%426 = icmp eq i16 %425, -32768 ; <i1> [#uses=0]
-> false
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92411 91177308-0d34-0410-b5e6-96231b3b80d8
pointer to int casts that confuse later optimizations. See PR3351
for details.
This improves but doesn't complete fix 483.xalancbmk because llvm-gcc
does this xform in GCC's "fold" routine as well. Clang++ will do
better I guess.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92408 91177308-0d34-0410-b5e6-96231b3b80d8
(X != null) | (Y != null) --> (X|Y) != 0
(X == null) & (Y == null) --> (X|Y) == 0
so that instcombine can stop doing this for pointers. This is part of PR3351,
which is a case where instcombine doing this for pointers (inserting ptrtoint)
is pessimizing code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92406 91177308-0d34-0410-b5e6-96231b3b80d8
a constantexpr gep on the 'base' side of the expression.
This completes comment #4 in PR3351, which comes from
483.xalancbmk.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92402 91177308-0d34-0410-b5e6-96231b3b80d8
multiply sequence when the power is a constant integer. Before, our
codegen for std::pow(.., int) always turned into a libcall, which was
really inefficient.
This should also make many gfortran programs happier I'd imagine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92388 91177308-0d34-0410-b5e6-96231b3b80d8
positive and negative forms of constants together. This
allows us to compile:
int foo(int x, int y) {
return (x-y) + (x-y) + (x-y);
}
into:
_foo: ## @foo
subl %esi, %edi
leal (%rdi,%rdi,2), %eax
ret
instead of (where the 3 and -3 were not factored):
_foo:
imull $-3, 8(%esp), %ecx
imull $3, 4(%esp), %eax
addl %ecx, %eax
ret
this started out as:
movl 12(%ebp), %ecx
imull $3, 8(%ebp), %eax
subl %ecx, %eax
subl %ecx, %eax
subl %ecx, %eax
ret
This comes from PR5359.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92381 91177308-0d34-0410-b5e6-96231b3b80d8
compare. On other targets we end up with a call to memcmp because we don't
want 16 individual byte loads. We should be able to use movups as well, but
we're failing to select the generated icmp.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92107 91177308-0d34-0410-b5e6-96231b3b80d8
SDISel. This optimization was causing simplifylibcalls to
introduce type-unsafe nastiness. This is the first step, I'll be
expanding the memcmp optimizations shortly, covering things that
we really really wouldn't want simplifylibcalls to do.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@92098 91177308-0d34-0410-b5e6-96231b3b80d8
missing check that an array reference doesn't go past the end of the array,
and remove some redundant checks for in-bound array and vector references
that are no longer needed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91897 91177308-0d34-0410-b5e6-96231b3b80d8
data on them, for example:
addb %al, (%rax)
simple-tests.txt:11:5: error: excess data detected in input
0 0 0 0 0
^
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91896 91177308-0d34-0410-b5e6-96231b3b80d8
by merging all returns in a function into a single one, but simplifycfg
currently likes to duplicate the return (an unfortunate choice!)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91890 91177308-0d34-0410-b5e6-96231b3b80d8
'GetValueInMiddleOfBlock' case, instead of inserting
duplicates.
A similar fix is almost certainly needed by the machine-level
SSAUpdate implementation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91820 91177308-0d34-0410-b5e6-96231b3b80d8
implement some optimizations for MIN(MIN()) and MAX(MAX()) and
MIN(MAX()) etc. This substantially improves the code in PR5822 but
doesn't kick in much elsewhere. 2 max's were optimized in
pairlocalalign and one in smg2000.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91814 91177308-0d34-0410-b5e6-96231b3b80d8
Use the presence of NSW/NUW to fold "icmp (x+cst), x" to a constant in
cases where it would otherwise be undefined behavior.
Surprisingly (to me at least), this triggers hundreds of the times in
a few benchmarks: lencode, ldecode, and 466.h264ref seem to *really*
like this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91812 91177308-0d34-0410-b5e6-96231b3b80d8
a bunch in lencode, ldecod, spass, 176.gcc, 252.eon, among others. It is
also the first part of PR5822
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91811 91177308-0d34-0410-b5e6-96231b3b80d8
cache a pointer as being unavailable due to phi trans in the
wrong place. This would cause later queries to fail even when
they didn't involve phi trans.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91787 91177308-0d34-0410-b5e6-96231b3b80d8
where instcombine would have to split a critical edge due to a
phi node of an invoke. Since instcombine can't change the CFG,
it has to bail out from doing the transformation.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91763 91177308-0d34-0410-b5e6-96231b3b80d8
bootstrap. This also replaces the WeakVH references that Chris objected to
with normal Value references.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91711 91177308-0d34-0410-b5e6-96231b3b80d8
be non-optimal. To be precise, we should avoid folding loads if the instructions
only update part of the destination register, and the non-updated part is not
needed. e.g. cvtss2sd, sqrtss. Unfolding the load from these instructions breaks
the partial register dependency and it can improve performance. e.g.
movss (%rdi), %xmm0
cvtss2sd %xmm0, %xmm0
instead of
cvtss2sd (%rdi), %xmm0
An alternative method to break dependency is to clear the register first. e.g.
xorps %xmm0, %xmm0
cvtss2sd (%rdi), %xmm0
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91672 91177308-0d34-0410-b5e6-96231b3b80d8
The change in SelectionDAGBuilder is needed to allow using bitcasts to convert
between f64 (the default type for ARM "d" registers) and 64-bit Neon vector
types. Radar 7457110.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91649 91177308-0d34-0410-b5e6-96231b3b80d8
This protects this test from depending on codegen not performing the
tail call optimization by default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91648 91177308-0d34-0410-b5e6-96231b3b80d8
to memcpy. (Such a memcpy is technically illegal, but in practice is safe
and is generated by struct self-assignment in C code.)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91621 91177308-0d34-0410-b5e6-96231b3b80d8
Fold (zext (and x, cst)) -> (and (zext x), cst)
DAG combiner likes to optimize expression in the other way so this would end up cause an infinite looping.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91574 91177308-0d34-0410-b5e6-96231b3b80d8
problem", this broke llvm-gcc bootstrap for release builds on
x86_64-apple-darwin10.
This reverts commit db22309800b224a9f5f51baf76071d7a93ce59c9.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91534 91177308-0d34-0410-b5e6-96231b3b80d8
in local register allocator. If a reg-reg copy has a phys reg
input and a virt reg output, and this is the last use of the phys
reg, assign the phys reg to the virt reg. If a reg-reg copy has
a phys reg output and we need to reload its spilled input, reload
it directly into the phys reg than passing it through another reg.
Following 76208, there is sometimes no dependency between the def of
a phys reg and its use; this creates a window where that phys reg
can be used for spilling (this is true in linear scan also). This
is bad and needs to be fixed a better way, although 76208 works too
well in practice to be reverted. However, there should normally be
no spilling within inline asm blocks. The patch here goes a long way
towards making this actually be true.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91485 91177308-0d34-0410-b5e6-96231b3b80d8
found last time. Instead of trying to modify the IR while iterating over it,
I've change it to keep a list of WeakVH references to dead instructions, and
then delete those instructions later. I also added some special case code to
detect and handle the situation when both operands of a memcpy intrinsic are
referencing the same alloca.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91459 91177308-0d34-0410-b5e6-96231b3b80d8
Checks that the code generated by 'tblgen --emit-llvmc' can be actually
compiled. Also fixes two bugs found in this way:
- forward_transformed_value didn't work with non-list arguments
- cl::ZeroOrOne is now called cl::Optional
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91404 91177308-0d34-0410-b5e6-96231b3b80d8
1. Only perform (zext (shl (zext x), y)) -> (shl (zext x), y) when y is a constant. This makes sure it remove at least one zest.
2. If the shift is a left shift, make sure the original shift cannot shift out bits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91399 91177308-0d34-0410-b5e6-96231b3b80d8
While scanning through the uses of an alloca, keep track of the current offset
relative to the start of the alloca, and check memory references to see if
the offset & size correspond to a component within the alloca. This has the
nice benefit of unifying much of the code from isSafeUseOfAllocation,
isSafeElementUse, and isSafeUseOfBitCastedAllocation. The code to rewrite
the uses of a promoted alloca, after it is determined to be safe, is
reorganized in the same way.
Also, when rewriting GEP instructions, mark them as "in-bounds" since all the
indices are known to be safe.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91184 91177308-0d34-0410-b5e6-96231b3b80d8
value size. This only manifested when memdep inprecisely returns clobber,
which is do to a caching issue in the PR5744 testcase. We can 'efficiently
emulate' this by using '-no-aa'
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@91004 91177308-0d34-0410-b5e6-96231b3b80d8
clobbers to forward pieces of large stores to small loads, we need to consider
the properly phi translated pointer in the store block.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@90978 91177308-0d34-0410-b5e6-96231b3b80d8
add, there is no need to scan the world to find the same add again.
This invalidates the previous testcase, which wasn't wonderful anyway,
because it needed a run of instcombine to permute the use-lists in
just the right way to before GVN was run (so it was really fragile).
Not a big loss.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@90973 91177308-0d34-0410-b5e6-96231b3b80d8
stores is not phi translating, thus it miscompiles really
crazy testcases. This is from inspection, I haven't seen
this in the wild.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@90930 91177308-0d34-0410-b5e6-96231b3b80d8
phi translation of complex expressions like &A[i+1]. This has the
following benefits:
1. The phi translation logic is all contained in its own class with
a strong interface and verification that it is self consistent.
2. The logic is more correct than before. Previously, if intermediate
expressions got PHI translated, we'd miss the update and scan for
the wrong pointers in predecessor blocks. @phi_trans2 is a testcase
for this.
3. We have a lot less code in memdep.
We can handle phi translation across blocks of things like @phi_trans3,
which is pretty insane :).
This patch should fix the miscompiles of 255.vortex, and I tested it
with a bootstrap of llvm-gcc, llvm-test and dejagnu of course.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@90926 91177308-0d34-0410-b5e6-96231b3b80d8
Don't print "SrcLine"; just print the filename and line number, which
is obvious enough and more informative.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@90631 91177308-0d34-0410-b5e6-96231b3b80d8
that I'm working on. This is manifesting as a miscompile of 255.vortex
on some targets. No check lines yet because it fails.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@90520 91177308-0d34-0410-b5e6-96231b3b80d8
The coalescer is supposed to clean these up, but when setting up parameters
for a function call, there may be copies to physregs. If the defining
instruction has been LICM'ed far away, the coalescer won't touch it.
The register allocation hint does not always work - when the register
allocator is backtracking, it clears the hints.
This patch takes care of a few more cases that r90163 missed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@90502 91177308-0d34-0410-b5e6-96231b3b80d8
Add a testcase for the above transformation.
Fix a bogus use of APInt noticed while tracking this down.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@90423 91177308-0d34-0410-b5e6-96231b3b80d8
both source operands. In the canonical form, the 2nd operand is changed to an
undef and the shuffle mask is adjusted to only reference elements from the 1st
operand. Radar 7434842.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@90417 91177308-0d34-0410-b5e6-96231b3b80d8
- A valno should be set HasRedefByEC if there is an early clobber def in the middle of its live ranges. It should not be set if the def of the valno is defined by an early clobber.
- If a physical register def is tied to an use and it's an early clobber, it just means the HasRedefByEC is set since it's still one continuous live range.
- Add a couple of missing checks for HasRedefByEC in the coalescer. In general, it should not coalesce a vr with a physical register if the physical register has a early clobber def somewhere. This is overly conservative but that's the price for using such a nasty inline asm "feature".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@90269 91177308-0d34-0410-b5e6-96231b3b80d8
This means that well connected blocks are copy coalesced before the less connected blocks. Connected blocks are more difficult to
coalesce because intervals are more complicated, so handling them first gives a greater chance of success.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@90194 91177308-0d34-0410-b5e6-96231b3b80d8
This helps us avoid silly copies when rematting values that are copied to a physical register:
leaq _.str44(%rip), %rcx
movq %rcx, %rsi
call _strcmp
becomes:
leaq _.str44(%rip), %rsi
call _strcmp
The coalescer will not touch the movq because that would tie down the physical register.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@90163 91177308-0d34-0410-b5e6-96231b3b80d8
more. Update the syntax we're checking for and filecheckize it too.
This will fix the selfhost buildbots but will 'break' the others (sigh) because
they're still linked against older LLVM which is emitting less optimized IR.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@90104 91177308-0d34-0410-b5e6-96231b3b80d8
handle cases like this:
void test(int N, double* G) {
long j;
for (j = 1; j < N - 1; j++)
G[j+1] = G[j] + G[j+1];
}
where G[1] isn't live into the loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@90041 91177308-0d34-0410-b5e6-96231b3b80d8
way that getUnderlyingObject does it.
This fixes the 'DecomposeGEPExpression and getUnderlyingObject disagree!'
assertion on sqlite3.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@90038 91177308-0d34-0410-b5e6-96231b3b80d8
translation of add with immediate. This allows us
to optimize this function:
void test(int N, double* G) {
long j;
G[1] = 1;
for (j = 1; j < N - 1; j++)
G[j+1] = G[j] + G[j+1];
}
to only do one load every iteration of the loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@90013 91177308-0d34-0410-b5e6-96231b3b80d8
array indexes. The "complex" case of SRoA still handles them, and correctly.
This fixes a weirdness where we'd correctly avoid transforming A[0][42] if
the 42 was too large, but we'd only do it if it was one gep, not two separate
ones.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@90007 91177308-0d34-0410-b5e6-96231b3b80d8
the problem only shows for msp430 and pic16 which is why it specifies
them using -march. But it is wrong to put such tests in CodeGen/Generic,
since not everyone builds these targets. Put a copy of the test in each
of the target test directories.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@90005 91177308-0d34-0410-b5e6-96231b3b80d8
where it is not available. It's unclear how to get this inserted
computation into GVN's scalar availability sets, Owen, help? :)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@89997 91177308-0d34-0410-b5e6-96231b3b80d8
generates store to undef and some generates store to null as the idiom
for undefined behavior. Since simplifycfg zaps both, don't remove the
undefined behavior in instcombine.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@89971 91177308-0d34-0410-b5e6-96231b3b80d8
first expression as P+4+4*i which we considered to possibly alias
P+4*j. Now we correctly analyze the former one as P+1+4*i.
@test10 is a sanity test that verfies that we know that P+4+4*i != P+4*i.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@89960 91177308-0d34-0410-b5e6-96231b3b80d8
This violates the ABI (that area is "reserved"), and
while it is safe if all code is generated with current
compilers, there is some very old code around that uses
that slot for something else, and breaks if it is stored
into. Adjust testcases looking for current behavior.
I've verified that the stack frame size is right in all
testcases, whether it changed or not. 7311323.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@89811 91177308-0d34-0410-b5e6-96231b3b80d8
than doing the same via constpool:
1. Load from constpool costs 3 cycles on A9, movt/movw pair - just 2.
2. Load from constpool might stall up to 300 cycles due to cache miss.
3. Movt/movw does not use load/store unit.
4. Less constpool entries => better compiler performance.
This is only enabled on ELF systems, since darwin does not have needed
relocations (yet).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@89720 91177308-0d34-0410-b5e6-96231b3b80d8
ConstantExpr, not just the top-level operator. This allows it to
fold many more constants.
Also, make GlobalOpt call ConstantFoldConstantExpression on
GlobalVariable initializers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@89659 91177308-0d34-0410-b5e6-96231b3b80d8
values, resolving references to them, and then removing the definitions.
If a template argument is set to an undefined value, we need to resolve
references to that argument to an explicit undefined value. The current code
leaves the reference to the template argument as it is, which causes an
assertion failure later when the definition of the template argument is
removed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@89581 91177308-0d34-0410-b5e6-96231b3b80d8