1874 Commits

Author SHA1 Message Date
gbeauche
7af6665619 icc9.1 & gcc4.1 warning fixes 2006-07-23 10:20:23 +00:00
gbeauche
2c27914196 Fix op_record_cr6_VD() to use fewer branches (gcc 4.1.2 build fix on x86-32) 2006-07-19 22:21:46 +00:00
gbeauche
87e1518e96 A few fixlets to the SIGSEGV library:
- Don't export transfer type definitions (formerly used by older API)
- Handle ADD instructions in ix86_skip_instruction() (generated by icc 9.1)
- Use "%p" format for EIP/RIP addresses
2006-07-19 21:31:10 +00:00
gbeauche
d1d7d5bd4c Fix remove_shm_range() to actually return something 2006-07-19 21:23:41 +00:00
gbeauche
f39c252fbd Fix 33-bit addressing mode check when compiling with icc 9.1 2006-07-19 21:22:41 +00:00
gbeauche
874bab017c Fix for parallel build (make -j20 here) 2006-07-19 06:00:26 +00:00
gbeauche
7705f85655 Add missing implementations for VAVGUB & VAVGUH. Optimize VSEL too. 2006-07-17 21:47:18 +00:00
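For readers unfamiliar with the two AltiVec instructions named in this commit: VAVGUB and VAVGUH compute a rounded-up average of each pair of unsigned elements (8-bit and 16-bit respectively). The following is a minimal Python sketch of that per-element semantics only. It is not the emulator's C++ code, and the helper names `vavgu`, `vavgub`, and `vavguh` are hypothetical.

```python
def vavgu(a, b, bits):
    """Element-wise rounded average of two unsigned vectors.

    Each result is (x + y + 1) >> 1, computed without losing the carry,
    which is the AltiVec "average unsigned" behaviour.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of elements")
    mask = (1 << bits) - 1
    return [((x & mask) + (y & mask) + 1) >> 1 for x, y in zip(a, b)]


def vavgub(a, b):
    # Hypothetical helper: 8-bit elements (16 per 128-bit vector register).
    return vavgu(a, b, 8)


def vavguh(a, b):
    # Hypothetical helper: 16-bit elements (8 per 128-bit vector register).
    return vavgu(a, b, 16)
```

The `+ 1` before the shift is what makes the average round up, so averaging 1 and 2 gives 2 rather than 1.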
gbeauche
07bf6fe6c1 Fix typo for ANDPS, ANDPD, ANDSS, ANDSD 2006-07-17 21:46:15 +00:00
gbeauche
a23f846bec symlink codegen_x86.h 2006-07-17 07:43:54 +00:00
gbeauche
c8a273332f Fix for 32-bit x86, was generating setcc CC,%dh instead of %dl.
i.e. force use of ecx & edx -- though it was fine in 64-bit mode, of course
2006-07-17 07:34:33 +00:00
gbeauche
e07e2196e3 Use new code generator. The gain is only 10%, bottlenecks are elsewhere.
Optimize Altivec vector splat instructions after Agner's guide.
2006-07-17 06:56:38 +00:00
gbeauche
ceb43ce19a Define global XMM registers for SIMD & FPU (64-bit mode) 2006-07-17 06:52:13 +00:00
gbeauche
4e624209d3 Add new code generator for testing purposes (i386, x86_64) -- It's to be
used for mid-level optimizations
2006-07-17 06:49:07 +00:00
gbeauche
c306dfc4fd Make VSCR an uint32, don't bother splitting it into NJ, SAT values since
the gain is almost nil and actually hurts performance in JIT mode.
2006-07-17 06:46:56 +00:00
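As a rough illustration of what keeping VSCR as a single 32-bit word means: the NJ and SAT flags can be read and set with masks instead of being stored as separate fields. The sketch below is Python rather than the project's C++, the function names are hypothetical, and the bit positions are the architectural AltiVec ones (NJ at mask 0x00010000, SAT at mask 0x00000001), not values taken from this commit.

```python
VSCR_NJ = 1 << 16   # non-Java mode bit
VSCR_SAT = 1 << 0   # sticky saturation bit


def set_sat(vscr):
    """Return VSCR with the sticky SAT bit set (value kept to 32 bits)."""
    return (vscr | VSCR_SAT) & 0xFFFFFFFF


def is_sat(vscr):
    return bool(vscr & VSCR_SAT)


def is_nj(vscr):
    return bool(vscr & VSCR_NJ)
```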
gbeauche
53f79caf8c Add LEALQmr, EMMS, SSE CMP and a series of new SSE opcodes (auto-generated) 2006-07-17 04:07:41 +00:00
gbeauche
cc12787047 Prepare for new code generator and mid-level optimizations. 2006-07-16 12:47:38 +00:00
gbeauche
7cfee5a2be Move processor capability information to utils-cpuinfo.[ch]pp. Add new
utils-sentinel.hpp for helper functions to be called at program initialization
and termination.
2006-07-16 12:28:01 +00:00
gbeauche
9bc307c3fd Fix for new code generator -- FIXME: backend macros should be enabled only
in ppc-jit.cpp (e.g. define a new ENABLE_JIT_TARGET_ASM macro?)
2006-07-16 12:23:03 +00:00
gbeauche
6113436ea4 Remove obsolete code (HAVE_STATIC_DATA_EXEC). 2006-07-16 12:18:59 +00:00
gbeauche
a2e0cc10c0 forgot to commit this __op_PARAM? change 2006-07-16 12:09:40 +00:00
gbeauche
9e64c3af94 Add more SSE templates for new SheepShaver's code generator -- though it
should be made independent of this file.
2006-07-14 16:53:48 +00:00
gbeauche
b4768fc62c Run-time assembler fixes:
- Check for RIP register only in 64-bit mode
- Add missing macros and arguments (BT*im)
- MOVSWQ/MOVZWQ are 64-bit mode instructions only
2006-07-14 09:09:12 +00:00
gbeauche
5d7ef13a9c Fix gen_op_invoke*() for 64-bit offsets on x86-64. Drop CPUPARAM since it's
now cached to a host register.
2006-07-09 15:19:32 +00:00
gbeauche
a5296875f1 Optimize alignment routine for x86 & x86_64. 2006-07-09 15:18:08 +00:00
gbeauche
d75a91497d Fix debugging of generated code to include the block chainer trampoline. 2006-07-09 15:17:15 +00:00
gbeauche
7d5898f97a Some minor optimizations: xchg (unused), movdqa in sse2 code. 2006-07-09 12:19:50 +00:00
gbeauche
abc911eaa7 Remove use of global register A0 (now aliased to T0). This makes it possible
to cache the CPU context pointer in a register, which makes generated
code independent of the CPU context. Not useful to SheepShaver, but it is for
another project for thread emulation on plain x86-32.

Note: AltiVec performance may drop a little on x86 but this will be restored
(and even improved) in the future.
2006-07-09 12:15:48 +00:00
gbeauche
d0a64733ef Use -fno-align-functions to really disable function alignment (a value of 0
used the default alignment, e.g. 16 bytes on x86_64). This is purely cosmetic
and only helps when reading the resulting disassembly.
2006-07-06 00:07:47 +00:00
gbeauche
d5bd143e40 Remove obsolete vminfp & vmaxfp (too long sequences) 2006-07-06 00:04:33 +00:00
gbeauche
c677dff47a Add more micro asm optimisations to x86{,-64} (mulhw, mulhwu, slw, srw, cntlzw
and subf* series). Also now enable the optimisations on x86_64 by default.
2006-07-06 00:01:04 +00:00
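For reference, three of the PowerPC instructions named in this commit have simple arithmetic definitions: mulhw and mulhwu return the high 32 bits of a signed or unsigned 32x32 product, and cntlzw counts leading zero bits in a 32-bit word. The sketch below models those semantics in Python. It says nothing about the x86 sequences the commit actually emits, and the function names simply mirror the mnemonics.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(x):
    """Interpret the low 32 bits of x as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mulhwu(a, b):
    # High 32 bits of the unsigned 64-bit product.
    return ((a & MASK32) * (b & MASK32)) >> 32


def mulhw(a, b):
    # High 32 bits of the signed 64-bit product, returned as a 32-bit pattern.
    return ((to_signed32(a) * to_signed32(b)) >> 32) & MASK32


def cntlzw(x):
    # Number of leading zero bits in a 32-bit word (32 for zero).
    return 32 - (x & MASK32).bit_length()
```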
gbeauche
e39e80b44b cosmetics 2006-07-04 23:27:06 +00:00
gbeauche
0123552ddc Use extra precision (e.g. long double) for fma operations though this
inhibits some underflow conditions.
2006-07-04 23:23:42 +00:00
gbeauche
98dea63921 Fix fmadd et al. to set FPSCR[VXISI] only if any of the multiply operands
is an infinity (2.1.5 -- don't set based on the intermediate result)
2006-07-04 23:20:46 +00:00
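The rule described in this commit can be sketched as a predicate over the three fmadd operands: VXISI (infinity minus infinity) is raised only when the product is infinite because one of its operands is, never because a finite product overflowed. The Python below is one reading of the commit message, not the emulator's code. The function name is hypothetical, it covers only the fmadd form a*c + b, and it assumes infinity times zero is reported separately (as VXIMZ).

```python
import math


def fmadd_sets_vxisi(a, c, b):
    """Return True if fmadd(a, c, b) = a*c + b should raise VXISI.

    Decided from the operands alone, not from a rounded intermediate product.
    """
    if math.isnan(a) or math.isnan(c) or math.isnan(b):
        return False                      # NaN operands are a different case
    if not (math.isinf(a) or math.isinf(c)):
        return False                      # product is not a true infinity
    if a == 0 or c == 0:
        return False                      # inf * 0: assumed to be VXIMZ, not VXISI
    if not math.isinf(b):
        return False                      # inf + finite is well defined
    product_sign = math.copysign(1.0, a) * math.copysign(1.0, c)
    return product_sign != math.copysign(1.0, b)
```

Note the third case in the usage below: `1e308 * 1e308` overflows to infinity in double precision, but because neither operand is an infinity the predicate stays false.

```python
inf = float("inf")
fmadd_sets_vxisi(inf, 2.0, -inf)      # True
fmadd_sets_vxisi(inf, 2.0, inf)       # False
fmadd_sets_vxisi(1e308, 1e308, -inf)  # False
```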
gbeauche
e020d63591 Fix frsp FPSCR[OX] condition 2006-07-04 23:17:37 +00:00
gbeauche
78952866b4 Fix mtfsb0 & mtfsb1 (VEX's xlc_dbl_u32 + code review) 2006-07-04 10:41:48 +00:00
gbeauche
c022ff87e6 remove dead code (fdivs was never used) 2006-07-04 08:58:40 +00:00
gbeauche
9f6edc436b Fix mismerge from kpx branch 2006-07-04 08:54:53 +00:00
gbeauche
7efab4276f Improve FPU emulation accuracy. However, PPC_ENABLE_FPU_EXCEPTIONS is still
set to 0 until generated code is optimized enough (current slow down factor
is 3x vs. previous core, expectations are about 50% slower FP code).

The main benefit is exception bits are accurate. All glibc test-fenv,
test-arith{,f}, test-double, test-float pass on ppc, and mostly on x86_64
with gcc 4.0.1. Yes, this is also compiler dependent.

FIXME: find a real Mac application that depends on precise FPSCR bits... I
don't want to bother optimizing this until someone shows me a real-world
application.
2006-07-04 07:19:18 +00:00
gbeauche
0a74d0559a Fix fnmadds & fnmsubs emulation + try to provide optimized fma routines for
better precision
2006-07-04 07:06:18 +00:00
gbeauche
dc88ee271d Use lrint() for fctiw on x86-64. This is because some glibc builds use an AMD-optimized
math library where floor(), ceil() et al. don't set the inexact flag correctly
2006-07-04 06:59:28 +00:00
gbeauche
dc3df920c5 Fix fctiw emulation (VEX's jm-ppc-test -f, handle current rounding mode) 2006-07-04 06:58:24 +00:00
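To make "handle current rounding mode" concrete: fctiw converts a double to a signed 32-bit integer using whichever rounding mode is active, saturating on overflow. The Python sketch below models that behaviour with the mode passed explicitly. The mode names and the `fctiw` signature are hypothetical, the NaN result of 0x80000000 follows the PowerPC architecture rather than this commit, and FPSCR flag updates are left out.

```python
import math

INT32_MAX = 2**31 - 1
INT32_MIN = -(2**31)

# Hypothetical mode names standing in for the FPSCR[RN] rounding modes.
ROUNDERS = {
    "nearest": round,      # round half to even
    "zero": math.trunc,    # toward zero
    "up": math.ceil,       # toward +infinity
    "down": math.floor,    # toward -infinity
}


def fctiw(x, mode="nearest"):
    """Convert a float to a saturated signed 32-bit integer."""
    if math.isnan(x):
        return INT32_MIN
    if math.isinf(x):
        return INT32_MAX if x > 0 else INT32_MIN
    rounded = ROUNDERS[mode](x)
    return max(INT32_MIN, min(INT32_MAX, rounded))
```

The same input can give different results depending on the mode, for example `fctiw(2.5)` is 2 under round-to-nearest-even while `fctiw(2.5, "up")` is 3.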
gbeauche
de5389bc0e Fix vminfp & vmaxfp emulation (VEX's jm-ppc-test -a, triggered nan bugs) 2006-07-04 04:47:04 +00:00
gbeauche
39ee6ba1aa Fix vctsxs & vctuxs emulation (VEX's jm-ppc-test -a, triggered inf/nan bugs) 2006-07-04 04:37:15 +00:00
gbeauche
0a2f9d3f03 Add fsel instruction emulation (VEX's jm-ppc-test -f) 2006-07-04 04:25:02 +00:00
gbeauche
635ee55a5d Fix floating-point single precision load/store (VEX's jm-ppc-test -f) 2006-07-04 04:21:02 +00:00
nigel
c071cd14f6 libgenemu can't find regflags in the XCode built newcpu.o,
so we compile it from the makefile into the lib, and not in the project
2006-05-25 05:03:03 +00:00
gbeauche
d9eb35f026 updates 2006-05-14 20:46:19 +00:00
gbeauche
e4f5757403 Merge from the QEMU tree:
- Fix IP packet re-assembly logic (Ed Swierk)
- Suppress unaligned accesses (Fabrice Bellard)
2006-05-14 17:27:38 +00:00
gbeauche
e339993b22 Updates. It's high time for a new snapshot. 2006-05-14 17:14:09 +00:00
gbeauche
c512377a12 Add 1GB item to GUI 2006-05-14 16:14:29 +00:00