int test(int x) { return 32768 - x; }
Fixed by teaching the function that checks a constant's validity to be used
as an immediate argument about subtract-from instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17476 91177308-0d34-0410-b5e6-96231b3b80d8
This method is really a gross hack, but at least we can make it work on
the targets we support right now.
This bug fix stops a crash in a testcase reduced from 176.gcc
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17443 91177308-0d34-0410-b5e6-96231b3b80d8
themselves. Make sure to update DSInfo correctly. This fixes a testcase
reduced from Prolangs-C++/objects
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17439 91177308-0d34-0410-b5e6-96231b3b80d8
function pointer equivalences. This fixes many problems, including a testcase
reduced Prolangs-C++/objects.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17437 91177308-0d34-0410-b5e6-96231b3b80d8
a DSGraph at a time instead of a function at a time. This is also more
correct, though it doesn't seem to fix any programs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17435 91177308-0d34-0410-b5e6-96231b3b80d8
* *DO NOT* print CBU graphs when asked to print our own. This is just
FREAKING confusing and misleading: it's better to not print anything.
* Simplify and clean up some code
* Add some more paranoia assertion checking code that I found to track
down this bug:
* Fix a nasty bug that was causing us to crash on Prolangs-C++/objects,
where we were missing processing some graphs. This hunk is the bugfix:
- if (!I->isExternal() && !FoldedGraphsMap.count(I))
+ if (!I->isExternal() && !ValMap.count(I))
urg!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17386 91177308-0d34-0410-b5e6-96231b3b80d8
the CBU graphs, copy them instead of hacking on the CBU graphs.
Also, instead of forwarding request from ECGraphs clients to the CBU graphs
clients, service them ourselves.
Finally, remove a broken "optimization"
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17378 91177308-0d34-0410-b5e6-96231b3b80d8
1. Calls to external global VARIABLES should not be treated as a call to an
external function
2. Efficiently deleting an element from a vector by using std::swap with
the back, then pop_back is NOT a good way to keep the vector sorted.
3. Our hope of having stuff get deleted by making them redundant just won't
work. In particular, if we have three calls in sequence that should be
merged: A, B, C first we unify B into A. To be sure that they appeared
identical (so B would be erased) we set B = A. On the next step, we
unified C into A and set C = A. Unfortunately, this is no guarantee that
C = B, so we would fail to delete the dead call. Switch to a more
explicit scheme.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17357 91177308-0d34-0410-b5e6-96231b3b80d8
* change some uses of NH.getNode() in a bool context to use !NH.isNull()
* Fix a bunch of places where we depended on the (undefined) order of
evaluation of arguments to function calls to ensure that getNode() was
called before getOffset(). In practice, this was NOT happening.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17354 91177308-0d34-0410-b5e6-96231b3b80d8
Fixed issue with generating the partial order. It now adds the nodes not in recurrences in sets for each connected component.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17351 91177308-0d34-0410-b5e6-96231b3b80d8
Correct the dependency of the Lexer.o file on the constructed
llvmAsmParser.h header file. It is not the Lexer.cpp file that depends on
the header, its the output of compiling Lexer.cpp, Lexer.o
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17289 91177308-0d34-0410-b5e6-96231b3b80d8
the strings for basic block labels in some cases. This amounted to about
120K of memory for namd, a medium sized program.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17262 91177308-0d34-0410-b5e6-96231b3b80d8
by the recently committed rlwimi.ll test file. Also commit initial code
for bitfield extract, although it is turned off until fully debugged.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17207 91177308-0d34-0410-b5e6-96231b3b80d8
* Convert register numbers from their opcode value to the real value, e.g.
PPC::R1 => 1 and PPC::F1 => 1
* Add correct handling of loading of global values which are PC-relative --
implement ha16() and lo16()
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17190 91177308-0d34-0410-b5e6-96231b3b80d8
be listed second as that is how the instructions are usually created (and is the
correct asm syntax) so that it's assembled correctly from its constituents
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17183 91177308-0d34-0410-b5e6-96231b3b80d8
The decimal value given in the manual (8 or 9) really needs to be multiplied by
a factor of 32 because of the group of 5 zero bits after the register code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17182 91177308-0d34-0410-b5e6-96231b3b80d8
as the shift amount operand to a shift instruction. This was causing us to
emit unnecessary clear operations for code such as:
int foo(int x) { return 1 << x; }
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17175 91177308-0d34-0410-b5e6-96231b3b80d8
including registers, constants, and partial support for global addresses
* The JIT is disabled by default to allow building llvm-gcc, which wants to test
running programs during configure
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17149 91177308-0d34-0410-b5e6-96231b3b80d8
Instead of unconditionally copying all phi node values into temporaries for
all successor blocks, generate code that will determine what successor
block will be called and then copy only those phi node values needed by
the successor block.
This seems to cut down namd execution time from being 8% higher than GCC to
4% higher than GCC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17144 91177308-0d34-0410-b5e6-96231b3b80d8
- Support added for functions, basic blocks, constant pool, constants,
registers, and some basic support for globals, all untested
* Turn assert()s into abort()s so that unimplemented functions fail in release
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17143 91177308-0d34-0410-b5e6-96231b3b80d8
loops. This optimization is not turned on by default yet, but may be run
with the opt tool's -loop-reduce flag. There are many FIXMEs listed in the
code that will make it far more applicable to a wide range of code, but you
have to start somewhere :)
This limited version currently triggers on the following tests in the
MultiSource directory:
pcompress2: 7 times
cfrac: 5 times
anagram: 2 times
ks: 6 times
yacr2: 2 times
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17134 91177308-0d34-0410-b5e6-96231b3b80d8
Simplify code by simplifying terminators that branch to blocks that start
with an unreachable instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17116 91177308-0d34-0410-b5e6-96231b3b80d8
change hacks off 10K of bytecode from perlbmk (.5%) even though the front-end
is not generating them yet and we are not optimizing the resultant code.
This isn't too bad.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17111 91177308-0d34-0410-b5e6-96231b3b80d8
particular, invoke ret values are only live in the normal dest of the invoke
not in the unwind dest.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17108 91177308-0d34-0410-b5e6-96231b3b80d8
exercise that I'm not interested in tackling right now. Just punt and treat them
like unwind's.
This 'fixes' test/Regression/Transforms/ADCE/unreachable-function.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17106 91177308-0d34-0410-b5e6-96231b3b80d8
If a function had no return instruction in it, and the result of the inlined
call instruction was used, we would crash.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17104 91177308-0d34-0410-b5e6-96231b3b80d8
unneccesary. This allows us to delete several hundred phi nodes of the
form PHI(x,x,x,undef) from 253.perlbmk and probably other programs as well.
This implements Mem2Reg/UndefValuesMerge.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17098 91177308-0d34-0410-b5e6-96231b3b80d8
0->field, which is illegal. Now we print ((foo*)0)->field.
The second hunk is an optimization to not print undefined phi values.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17094 91177308-0d34-0410-b5e6-96231b3b80d8
double %test(uint %X) {
%tmp.1 = cast uint %X to double ; <double> [#uses=1]
ret double %tmp.1
}
into:
test:
sub %ESP, 8
mov %EAX, DWORD PTR [%ESP + 12]
mov %ECX, 0
mov DWORD PTR [%ESP], %EAX
mov DWORD PTR [%ESP + 4], %ECX
fild QWORD PTR [%ESP]
add %ESP, 8
ret
... which basically zero extends to 8 bytes, then does an fild for an
8-byte signed int.
Now we generate this:
test:
sub %ESP, 4
mov %EAX, DWORD PTR [%ESP + 8]
mov DWORD PTR [%ESP], %EAX
fild DWORD PTR [%ESP]
shr %EAX, 31
fadd DWORD PTR [.CPItest_0 + 4*%EAX]
add %ESP, 4
ret
.section .rodata
.align 4
.CPItest_0:
.quad 5728578726015270912
This does a 32-bit signed integer load, then adds in an offset if the sign
bit of the integer was set.
It turns out that this is substantially faster than the preceeding sequence.
Consider this testcase:
unsigned a[2]={1,2};
volatile double G;
void main() {
int i;
for (i=0; i<100000000; ++i )
G += a[i&1];
}
On zion (a P4 Xeon, 3Ghz), this patch speeds up the testcase from 2.140s
to 0.94s.
On apoc, an athlon MP 2100+, this patch speeds up the testcase from 1.72s
to 1.34s.
Note that the program takes 2.5s/1.97s on zion/apoc with GCC 3.3 -O3
-fomit-frame-pointer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17083 91177308-0d34-0410-b5e6-96231b3b80d8
%X = and Y, constantint
%Z = setcc %X, 0
instead of emitting:
and %EAX, 3
test %EAX, %EAX
je .LBBfoo2_2 # UnifiedReturnBlock
We now emit:
test %EAX, 3
je .LBBfoo2_2 # UnifiedReturnBlock
This triggers 581 times on 176.gcc for example.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17080 91177308-0d34-0410-b5e6-96231b3b80d8
1. optional shift left
2. and x, immX
3. and y, immY
4. or z, x, y
==> rlwimi z, x, y, shift, mask begin, mask end
where immX == ~immY and immX is a run of set bits. This transformation
fires 32 times on voronoi, once on espresso, and probably several
dozen times on external benchmarks such as gcc.
To put this in terms of actual code generated for
struct B { unsigned a : 3; unsigned b : 2; };
void storeA (struct B *b, int v) { b->a = v;}
void storeB (struct B *b, int v) { b->b = v;}
Old:
_storeA:
rlwinm r2, r4, 0, 29, 31
lwz r4, 0(r3)
rlwinm r4, r4, 0, 0, 28
or r2, r4, r2
stw r2, 0(r3)
blr
_storeB:
rlwinm r2, r4, 3, 0, 28
rlwinm r2, r2, 0, 27, 28
lwz r4, 0(r3)
rlwinm r4, r4, 0, 29, 26
or r2, r2, r4
stw r2, 0(r3)
blr
New:
_storeA:
lwz r2, 0(r3)
rlwimi r2, r4, 0, 29, 31
stw r2, 0(r3)
blr
_storeB:
lwz r2, 0(r3)
rlwimi r2, r4, 3, 27, 28
stw r2, 0(r3)
blr
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@17078 91177308-0d34-0410-b5e6-96231b3b80d8