asvitkine
be4aba2f82
Fix Master of Orion II
2007-01-21 17:21:23 +00:00
gbeauche
9999881c78
Enable JIT in non-constructor so that a user-defined value can be set later
2007-01-21 13:44:27 +00:00
gbeauche
3b6a579f33
Optimize lwarx/stwcx for uniprocessors and generate code for them. There is
no performance increase even though those two instructions represented approx
18M of untranslated instructions on a simple boot to MacOS.
2007-01-18 07:02:35 +00:00
gbeauche
b9486d35e3
Rearrange powerpc_registers struct and nuke fp_result register which is
only needed for JIT (and to be handled differently in the future).
2007-01-17 07:05:19 +00:00
gbeauche
69d3fcba95
Update for new instr_info_t format
2007-01-17 06:56:09 +00:00
gbeauche
5b0b60da76
Remove specialised decoders. This will be done differently, if necessary.
2007-01-17 06:20:36 +00:00
gbeauche
e4af8a1909
Report SSSE3 instead of SSE4 (to be released later).
2007-01-15 07:00:16 +00:00
gbeauche
9dfecc4279
Update CPU table to kernel 2.6.17+ code (POWER6, Cell, PA6T). Fix detection
of the CPU string (separator is actually ','). Fix detection of CPU clock
frequency when it is expressed as a float.
2006-10-26 05:25:19 +00:00
gbeauche
954593d1c0
Generate spcflags checks at the start of the block. This opens up better
opportunities once the CR cache is implemented.
2006-07-30 16:29:10 +00:00
gbeauche
bcf7f9a2cd
Add throw() specs for Linux glibc platforms
2006-07-30 09:49:21 +00:00
gbeauche
2c27914196
Fix op_record_cr6_VD() to use fewer branches (gcc 4.1.2 build fix on x86-32)
2006-07-19 22:21:46 +00:00
gbeauche
874bab017c
Fix for parallel build (make -j20 here)
2006-07-19 06:00:26 +00:00
gbeauche
7705f85655
Add missing implementations for VAVGUB & VAVGUH. Optimize VSEL too.
2006-07-17 21:47:18 +00:00
gbeauche
07bf6fe6c1
Fix typo for ANDPS, ANDPD, ANDSS, ANDSD
2006-07-17 21:46:15 +00:00
gbeauche
a23f846bec
symlink codegen_x86.h
2006-07-17 07:43:54 +00:00
gbeauche
c8a273332f
Fix for 32-bit x86, was generating setcc CC,%dh instead of %dl.
i.e. force use of ecx & edx -- though it was fine in 64-bit mode, of course
2006-07-17 07:34:33 +00:00
gbeauche
e07e2196e3
Use new code generator. The gain is only 10%, bottlenecks are elsewhere.
Optimize Altivec vector splat instructions after Agner's guide.
2006-07-17 06:56:38 +00:00
gbeauche
ceb43ce19a
Define global XMM registers for SIMD & FPU (64-bit mode)
2006-07-17 06:52:13 +00:00
gbeauche
4e624209d3
Add new code generator for testing purposes (i386, x86_64) -- It's to be
used for mid-level optimizations
2006-07-17 06:49:07 +00:00
gbeauche
c306dfc4fd
Make VSCR a uint32, don't bother splitting it into NJ, SAT values since
the gain is almost nil and actually hurts performance in JIT mode.
2006-07-17 06:46:56 +00:00
gbeauche
cc12787047
Prepare for new code generator and mid-level optimizations.
2006-07-16 12:47:38 +00:00
gbeauche
7cfee5a2be
Move processor capability information to utils-cpuinfo.[ch]hpp. Add new
utils-sentinel.hpp for helper functions to be called at program initialization
and termination.
2006-07-16 12:28:01 +00:00
gbeauche
9bc307c3fd
Fix for new code generator -- FIXME: backend macros should be enabled only
in ppc-jit.cpp (e.g. define a new ENABLE_JIT_TARGET_ASM macro?)
2006-07-16 12:23:03 +00:00
gbeauche
6113436ea4
Remove obsolete code (HAVE_STATIC_DATA_EXEC).
2006-07-16 12:18:59 +00:00
gbeauche
a2e0cc10c0
forgot to commit this __op_PARAM? change
2006-07-16 12:09:40 +00:00
gbeauche
5d7ef13a9c
Fix gen_op_invoke*() for 64-bit offsets on x86-64. Drop CPUPARAM since it's
now cached to a host register.
2006-07-09 15:19:32 +00:00
gbeauche
a5296875f1
Optimize alignment routine for x86 & x86_64.
2006-07-09 15:18:08 +00:00
gbeauche
d75a91497d
Fix debugging of generated code to include the block chainer trampoline.
2006-07-09 15:17:15 +00:00
gbeauche
7d5898f97a
Some minor optimizations: xchg (unused), movdqa in sse2 code.
2006-07-09 12:19:50 +00:00
gbeauche
abc911eaa7
Remove use of global register A0 (now aliased to T0). This makes it possible
to cache the CPU context pointer in a register and thus render generated
code CPU context independent. Not useful to SheepShaver, but it is for
another project for thread emulation on plain x86-32.
Note: AltiVec performance may drop a little on x86 but this will be restored
(and even improved) in the future.
2006-07-09 12:15:48 +00:00
gbeauche
d0a64733ef
Use -fno-align-functions to really disable function alignment (a value of 0
used the default alignment, e.g. 16 bytes on x86_64). This is purely cosmetic
and only helps reading the resulting disassembly.
2006-07-06 00:07:47 +00:00
gbeauche
d5bd143e40
Remove obsolete vminfp & vmaxfp (too long sequences)
2006-07-06 00:04:33 +00:00
gbeauche
c677dff47a
Add more micro asm optimisations to x86{,-64} (mulhw, mulhwu, slw, srw, cntlzw
and subf* series). Also now enable the optimizations on x86_64 by default.
2006-07-06 00:01:04 +00:00
gbeauche
e39e80b44b
cosmetics
2006-07-04 23:27:06 +00:00
gbeauche
0123552ddc
Use extra precision (e.g. long double) for fma operations though this
inhibits some underflow conditions.
2006-07-04 23:23:42 +00:00
gbeauche
98dea63921
Fix fmadd et al. to set FPSCR[VXISI] only if any of the multiply operands
is an infinity (2.1.5 -- don't set based on the intermediate result)
2006-07-04 23:20:46 +00:00
gbeauche
e020d63591
Fix frsp FPSCR[OX] condition
2006-07-04 23:17:37 +00:00
gbeauche
78952866b4
Fix mtfsb0 & mtfsb1 (VEX's xlc_dbl_u32 + code review)
2006-07-04 10:41:48 +00:00
gbeauche
c022ff87e6
remove dead code (fdivs was never used)
2006-07-04 08:58:40 +00:00
gbeauche
9f6edc436b
Fix mismerge from kpx branch
2006-07-04 08:54:53 +00:00
gbeauche
7efab4276f
Improve FPU emulation accuracy. However, PPC_ENABLE_FPU_EXCEPTIONS is still
set to 0 until generated code is optimized enough (current slowdown factor
is 3x vs. previous core, expectations are about 50% slower FP code).
The main benefit is exception bits are accurate. All glibc test-fenv,
test-arith{,f}, test-double, test-float pass on ppc, and mostly on x86_64
with gcc 4.0.1. Yes, this is also compiler dependent.
FIXME: find a real Mac application that depends on precise FPSCR bits... I
don't want to bother optimizing until someone shows me a real-world
application.
2006-07-04 07:19:18 +00:00
gbeauche
0a74d0559a
Fix fnmadds & fnmsubs emulation + try to provide optimized fma routines for
better precision
2006-07-04 07:06:18 +00:00
gbeauche
dc88ee271d
Use lrint() for fctiw on x86-64. This is because some glibc builds use an
AMD-optimized math library where floor(), ceil() et al. don't set the inexact
flag correctly
2006-07-04 06:59:28 +00:00
gbeauche
dc3df920c5
Fix fctiw emulation (VEX's jm-ppc-test -f, handle current rounding mode)
2006-07-04 06:58:24 +00:00
gbeauche
de5389bc0e
Fix vminfp & vmaxfp emulation (VEX's jm-ppc-test -a, triggered nan bugs)
2006-07-04 04:47:04 +00:00
gbeauche
39ee6ba1aa
Fix vctsxs & vctuxs emulation (VEX's jm-ppc-test -a, triggered inf/nan bugs)
2006-07-04 04:37:15 +00:00
gbeauche
0a2f9d3f03
Add fsel instruction emulation (VEX's jm-ppc-test -f)
2006-07-04 04:25:02 +00:00
gbeauche
635ee55a5d
Fix floating-point single precision load/store (VEX's jm-ppc-test -f)
2006-07-04 04:21:02 +00:00
gbeauche
d9eb35f026
updates
2006-05-14 20:46:19 +00:00
gbeauche
e339993b22
Updates. It's high time for a new snapshot.
2006-05-14 17:14:09 +00:00