Commit Graph

2064 Commits

Author SHA1 Message Date
gbeauche
e07e2196e3 Use new code generator. The gain is only 10%, bottlenecks are elsewhere.
Optimize Altivec vector splat instructions after Agner's guide.
2006-07-17 06:56:38 +00:00
gbeauche
ceb43ce19a Define global XMM registers for SIMD & FPU (64-bit mode) 2006-07-17 06:52:13 +00:00
gbeauche
4e624209d3 Add new code generator for testing purposes (i386, x86_64) -- It's to be
used for mid-level optimizations
2006-07-17 06:49:07 +00:00
gbeauche
c306dfc4fd Make VSCR an uint32, don't bother splitting it into NJ, SAT values since
the gain is almost nil and actually hurts performance in JIT mode.
2006-07-17 06:46:56 +00:00
gbeauche
53f79caf8c Add LEALQmr, EMMS, SSE CMP and a series of new SSE opcodes (auto-generated) 2006-07-17 04:07:41 +00:00
gbeauche
cc12787047 Prepare for new code generator and mid-level optimizations. 2006-07-16 12:47:38 +00:00
gbeauche
7cfee5a2be Move processor capability information to utils-cpuinfo.[ch]hpp. Add new
utils-sentinel.hpp for helper functions to be called at program initialization
and termination.
2006-07-16 12:28:01 +00:00
gbeauche
9bc307c3fd Fix for new code generator -- FIXME: backend macros should be enabled only
in ppc-jit.cpp (e.g. define a new ENABLE_JIT_TARGET_ASM macro?)
2006-07-16 12:23:03 +00:00
gbeauche
6113436ea4 Remove obsolete code (HAVE_STATIC_DATA_EXEC). 2006-07-16 12:18:59 +00:00
gbeauche
a2e0cc10c0 forgot to commit this __op_PARAM? change 2006-07-16 12:09:40 +00:00
gbeauche
9e64c3af94 Add more SSE templates for new SheepShaver's code generator -- though it
should be made independent of this file.
2006-07-14 16:53:48 +00:00
gbeauche
b4768fc62c Run-time assembler fixes:
- Check for RIP register only in 64-bit mode
- Add missing macros and arguments (BT*im)
- MOVSWQ/MOVZWQ are 64-bit mode instructions only
2006-07-14 09:09:12 +00:00
gbeauche
5d7ef13a9c Fix gen_op_invoke*() for 64-bit offsets on x86-64. Drop CPUPARAM since it's
now cached to a host register.
2006-07-09 15:19:32 +00:00
gbeauche
a5296875f1 Optimize alignment routine for x86 & x86_64. 2006-07-09 15:18:08 +00:00
gbeauche
d75a91497d Fix debugging of generated code to include the block chainer trampoline. 2006-07-09 15:17:15 +00:00
gbeauche
7d5898f97a Some minor optimizations: xchg (unused), movdqa in sse2 code. 2006-07-09 12:19:50 +00:00
gbeauche
abc911eaa7 Remove use of global register A0 (now aliased to T0). This makes it possible
to cache the CPU context pointer to a register and thus rendering generated
code CPU context independent. Not useful to SheepShaver, but it is for
another project for threads emulation on plain x86-32.

Note: AltiVec performance may drop a little on x86 but this will be restored
(and even improved) in the future.
2006-07-09 12:15:48 +00:00
gbeauche
d0a64733ef Use -fno-align-functions to really disable function alignment (a value of 0
used the default alignment, e.g. 16 bytes on x86_64). This is purely cosmetics
and only helps reading the resulting disassembly.
2006-07-06 00:07:47 +00:00
gbeauche
d5bd143e40 Remove obsolete vminfp & vmaxfp (too long sequences) 2006-07-06 00:04:33 +00:00
gbeauche
c677dff47a Add more micro asm optimisations to x86{,-64} (mulhw, mulhwu, slw, srw, cntlzw
and subf* series). Also now enable the optimzations on x86_64 by default.
2006-07-06 00:01:04 +00:00
gbeauche
e39e80b44b cosmetics 2006-07-04 23:27:06 +00:00
gbeauche
0123552ddc Use extra precision (e.g. long double) for fma operations though this
inhibits some underflow conditions.
2006-07-04 23:23:42 +00:00
gbeauche
98dea63921 Fix fmadd et al. to set FPSCR[VXISI] only if any of the multiply operands
is an inifinity (2.1.5 -- don't set based on the intermediate result)
2006-07-04 23:20:46 +00:00
gbeauche
e020d63591 Fix frsp FPSCR[OX] condition 2006-07-04 23:17:37 +00:00
gbeauche
78952866b4 Fix mtfsb0 & mtfsb1 (VEX's xlc_dbl_u32 + code review) 2006-07-04 10:41:48 +00:00
gbeauche
c022ff87e6 remove dead code (fdivs was never used) 2006-07-04 08:58:40 +00:00
gbeauche
9f6edc436b Fix mismerge from kpx branch 2006-07-04 08:54:53 +00:00
gbeauche
7efab4276f Improve FPU emulation accurracy. However, PPC_ENABLE_FPU_EXCEPTIONS is still
set to 0 until generated code is optimized enough (current slow down factor
is 3x vs. previous core, expectations are about 50% slower FP code).

The main benefit is exception bits are accurate. All glibc test-fenv,
test-arith{,f}, test-double, test-float pass on ppc, and mostly on x86_64
with gcc 4.0.1. Yes, this is also compiler dependent.

FIXME: find a real Mac application that depends on precise FPSCR bits... I
think I don't want to care optimizing yet until someone shows me a real world
application.
2006-07-04 07:19:18 +00:00
gbeauche
0a74d0559a Fix fnmadds & fnmsubs emulation + try to provide optimized fma routines for
better precision
2006-07-04 07:06:18 +00:00
gbeauche
dc88ee271d Use lrint() for fctiw on x86-64. This is because some glibc use AMD optimized
math library where floor(), ceil() et al. don't set the inexact flag correctly
2006-07-04 06:59:28 +00:00
gbeauche
dc3df920c5 Fix fctiw emulation (VEX's jm-ppc-test -f, handle current rounding mode) 2006-07-04 06:58:24 +00:00
gbeauche
de5389bc0e Fix vminfp & vmaxfp emulation (VEX's jm-ppc-test -a, triggered nan bugs) 2006-07-04 04:47:04 +00:00
gbeauche
39ee6ba1aa Fix vctsxs & vctuxs emulation (VEX's jm-ppc-test -a, triggered inf/nan bugs) 2006-07-04 04:37:15 +00:00
gbeauche
0a2f9d3f03 Add fsel instruction emulation (VEX's jm-ppc-test -f) 2006-07-04 04:25:02 +00:00
gbeauche
635ee55a5d Fix floating-point single precision load/store (VEX's jm-ppc-test -f) 2006-07-04 04:21:02 +00:00
nigel
c071cd14f6 libgenemu can't find regflags in the XCode built newcpu.o,
so we compile it from the makefile into the lib, and not in the project
2006-05-25 05:03:03 +00:00
gbeauche
d9eb35f026 updates 2006-05-14 20:46:19 +00:00
gbeauche
e4f5757403 Merge from the QEMU tree:
- Fix IP packet re-assembly logic (Ed Swierk)
- Suppress unaligned accesses (Fabrice Bellard)
2006-05-14 17:27:38 +00:00
gbeauche
e339993b22 Updates. It's high time for a new snapshot. 2006-05-14 17:14:09 +00:00
gbeauche
c512377a12 Add 1GB item to GUI 2006-05-14 16:14:29 +00:00
gbeauche
ab1565ced2 Support up to 1GB in SheepShaver for Windows now. 2006-05-14 16:13:54 +00:00
gbeauche
bd6ec66354 Add missing "etherguid" prefs item for Basilisk II Ethernet support (b2ether). 2006-05-14 16:13:16 +00:00
gbeauche
3f3891ab31 Move up NATMEM_OFFSET to 0x11000000. This is arbitrarily determined to be
the base of the largest free block. Turns out SDL libraries are loaded around
0x10000000 so we have some luck here.
2006-05-14 15:58:11 +00:00
gbeauche
8a27c90e15 Temporary workaround for Windows (shndx_text is not unique...) 2006-05-14 14:12:57 +00:00
gbeauche
7660affe77 Windows apparently needs an extra mouse event to make the new cursor image
visible.
2006-05-14 14:11:46 +00:00
gbeauche
3aa9d912b7 Fix build on Windows (<malloc.h> for alloca()) 2006-05-14 13:49:53 +00:00
gbeauche
d26f8ad8e2 Fix for DIRECT_ADDRESSING mode (Windows) 2006-05-14 13:48:05 +00:00
nigel
1a7d3714fb Compile the CPU emulator in the makefile, so that it picks up configure-
generated #defines that are needed for running on X86 (vs PPC) emulator
2006-05-14 11:58:41 +00:00
nigel
c5c748f7e2 Some windowed graphics drawing methods cause the snapshot code to fail
if you have changed the depth since boot (seems to be something strange
with the parameters that I still haven't worked out). If this happens,
we now put a suggested workaround in the warning message.
2006-05-14 10:17:06 +00:00
gbeauche
7ef09b90bb NQD dirty boxes, BeOS backend -- no-op. 2006-05-14 08:35:35 +00:00