685 Commits

Author SHA1 Message Date
gbeauche
9999881c78 Enable JIT in non-constructor so that a user-defined value can be set later 2007-01-21 13:44:27 +00:00
gbeauche
3b6a579f33 Optimize lwarx/stwcx for uniprocessors and generate code for them. There is
no performance increase even though those two instructions represented approx
18M of untranslated instructions on a simple boot to MacOS.
2007-01-18 07:02:35 +00:00
gbeauche
b9486d35e3 Rearrange powerpc_registers struct and nuke fp_result register which is
only needed for JIT (and to be handled differently in the future).
2007-01-17 07:05:19 +00:00
gbeauche
69d3fcba95 Update for new instr_info_t format 2007-01-17 06:56:09 +00:00
gbeauche
5b0b60da76 Remove specialised decoders. This will be done differently, if necessary. 2007-01-17 06:20:36 +00:00
gbeauche
e4af8a1909 Report SSSE3 instead of SSE4 (to be released later). 2007-01-15 07:00:16 +00:00
gbeauche
9dfecc4279 Update CPU table to kernel 2.6.17+ code (POWER6, Cell, PA6T). Fix detection
of the CPU string (separator is actually ','). Fix detection of CPU clock
frequency when it is expressed as a float.
2006-10-26 05:25:19 +00:00
gbeauche
954593d1c0 Generate spcflags checks at the start of the block. This makes better
opportunities when CR cache is implemented.
2006-07-30 16:29:10 +00:00
gbeauche
bcf7f9a2cd Add throw() specs for Linux glibc platforms 2006-07-30 09:49:21 +00:00
gbeauche
2c27914196 Fix op_record_cr6_VD() to use fewer branches (gcc 4.1.2 build fix on x86-32) 2006-07-19 22:21:46 +00:00
gbeauche
874bab017c Fix for parallel build (make -j20 here) 2006-07-19 06:00:26 +00:00
gbeauche
7705f85655 Add missing implementations for VAVGUB & VAVGUH. Optimize VSEL too. 2006-07-17 21:47:18 +00:00
gbeauche
07bf6fe6c1 Fix typo for ANDPS, ANDPD, ANDSS, ANDSD 2006-07-17 21:46:15 +00:00
gbeauche
a23f846bec symlink codegen_x86.h 2006-07-17 07:43:54 +00:00
gbeauche
c8a273332f Fix for 32-bit x86, was generating setcc CC,%dh instead of %dl.
i.e. force use of ecx & edx -- though it was fine in 64-bit mode, of course
2006-07-17 07:34:33 +00:00
gbeauche
e07e2196e3 Use new code generator. The gain is only 10%, bottlenecks are elsewhere.
Optimize Altivec vector splat instructions after Agner's guide.
2006-07-17 06:56:38 +00:00
gbeauche
ceb43ce19a Define global XMM registers for SIMD & FPU (64-bit mode) 2006-07-17 06:52:13 +00:00
gbeauche
4e624209d3 Add new code generator for testing purposes (i386, x86_64) -- It's to be
used for mid-level optimizations
2006-07-17 06:49:07 +00:00
gbeauche
c306dfc4fd Make VSCR an uint32, don't bother splitting it into NJ, SAT values since
the gain is almost nil and actually hurts performance in JIT mode.
2006-07-17 06:46:56 +00:00
gbeauche
cc12787047 Prepare for new code generator and mid-level optimizations. 2006-07-16 12:47:38 +00:00
gbeauche
7cfee5a2be Move processor capability information to utils-cpuinfo.[ch]hpp. Add new
utils-sentinel.hpp for helper functions to be called at program initialization
and termination.
2006-07-16 12:28:01 +00:00
gbeauche
9bc307c3fd Fix for new code generator -- FIXME: backend macros should be enabled only
in ppc-jit.cpp (e.g. define a new ENABLE_JIT_TARGET_ASM macro?)
2006-07-16 12:23:03 +00:00
gbeauche
6113436ea4 Remove obsolete code (HAVE_STATIC_DATA_EXEC). 2006-07-16 12:18:59 +00:00
gbeauche
a2e0cc10c0 forgot to commit this __op_PARAM? change 2006-07-16 12:09:40 +00:00
gbeauche
5d7ef13a9c Fix gen_op_invoke*() for 64-bit offsets on x86-64. Drop CPUPARAM since it's
now cached to a host register.
2006-07-09 15:19:32 +00:00
gbeauche
a5296875f1 Optimize alignment routine for x86 & x86_64. 2006-07-09 15:18:08 +00:00
gbeauche
d75a91497d Fix debugging of generated code to include the block chainer trampoline. 2006-07-09 15:17:15 +00:00
gbeauche
7d5898f97a Some minor optimizations: xchg (unused), movdqa in sse2 code. 2006-07-09 12:19:50 +00:00
gbeauche
abc911eaa7 Remove use of global register A0 (now aliased to T0). This makes it possible
to cache the CPU context pointer to a register and thus rendering generated
code CPU context independent. Not useful to SheepShaver, but it is for
another project for threads emulation on plain x86-32.

Note: AltiVec performance may drop a little on x86 but this will be restored
(and even improved) in the future.
2006-07-09 12:15:48 +00:00
gbeauche
d0a64733ef Use -fno-align-functions to really disable function alignment (a value of 0
used the default alignment, e.g. 16 bytes on x86_64). This is purely cosmetics
and only helps reading the resulting disassembly.
2006-07-06 00:07:47 +00:00
gbeauche
d5bd143e40 Remove obsolete vminfp & vmaxfp (too long sequences) 2006-07-06 00:04:33 +00:00
gbeauche
c677dff47a Add more micro asm optimisations to x86{,-64} (mulhw, mulhwu, slw, srw, cntlzw
and subf* series). Also now enable the optimizations on x86_64 by default.
2006-07-06 00:01:04 +00:00
gbeauche
e39e80b44b cosmetics 2006-07-04 23:27:06 +00:00
gbeauche
0123552ddc Use extra precision (e.g. long double) for fma operations though this
inhibits some underflow conditions.
2006-07-04 23:23:42 +00:00
gbeauche
98dea63921 Fix fmadd et al. to set FPSCR[VXISI] only if any of the multiply operands
is an infinity (2.1.5 -- don't set based on the intermediate result)
2006-07-04 23:20:46 +00:00
gbeauche
e020d63591 Fix frsp FPSCR[OX] condition 2006-07-04 23:17:37 +00:00
gbeauche
78952866b4 Fix mtfsb0 & mtfsb1 (VEX's xlc_dbl_u32 + code review) 2006-07-04 10:41:48 +00:00
gbeauche
c022ff87e6 remove dead code (fdivs was never used) 2006-07-04 08:58:40 +00:00
gbeauche
9f6edc436b Fix mismerge from kpx branch 2006-07-04 08:54:53 +00:00
gbeauche
7efab4276f Improve FPU emulation accuracy. However, PPC_ENABLE_FPU_EXCEPTIONS is still
set to 0 until generated code is optimized enough (current slow down factor
is 3x vs. previous core, expectations are about 50% slower FP code).

The main benefit is exception bits are accurate. All glibc test-fenv,
test-arith{,f}, test-double, test-float pass on ppc, and mostly on x86_64
with gcc 4.0.1. Yes, this is also compiler dependent.

FIXME: find a real Mac application that depends on precise FPSCR bits... I
don't think I want to bother optimizing until someone shows me a real-world
application.
2006-07-04 07:19:18 +00:00
gbeauche
0a74d0559a Fix fnmadds & fnmsubs emulation + try to provide optimized fma routines for
better precision
2006-07-04 07:06:18 +00:00
gbeauche
dc88ee271d Use lrint() for fctiw on x86-64. This is because some glibc use AMD optimized
math library where floor(), ceil() et al. don't set the inexact flag correctly
2006-07-04 06:59:28 +00:00
gbeauche
dc3df920c5 Fix fctiw emulation (VEX's jm-ppc-test -f, handle current rounding mode) 2006-07-04 06:58:24 +00:00
gbeauche
de5389bc0e Fix vminfp & vmaxfp emulation (VEX's jm-ppc-test -a, triggered nan bugs) 2006-07-04 04:47:04 +00:00
gbeauche
39ee6ba1aa Fix vctsxs & vctuxs emulation (VEX's jm-ppc-test -a, triggered inf/nan bugs) 2006-07-04 04:37:15 +00:00
gbeauche
0a2f9d3f03 Add fsel instruction emulation (VEX's jm-ppc-test -f) 2006-07-04 04:25:02 +00:00
gbeauche
635ee55a5d Fix floating-point single precision load/store (VEX's jm-ppc-test -f) 2006-07-04 04:21:02 +00:00
gbeauche
d9eb35f026 updates 2006-05-14 20:46:19 +00:00
gbeauche
e339993b22 Updates. It's high time for a new snapshot. 2006-05-14 17:14:09 +00:00
gbeauche
c512377a12 Add 1GB item to GUI 2006-05-14 16:14:29 +00:00