49 Commits

Author SHA1 Message Date
Alexei Svitkine
b150b42fc6 Fix some string conversion warnings. 2017-12-10 11:27:08 -05:00
Alexei Svitkine
181634ab31 Fix more Xcode8 warnings and tweak project settings. 2016-12-17 23:31:03 -05:00
asvitkine
b3b5db5456 [Michael Schmitt]
Attached is a patch to SheepShaver to fix memory allocation problems when OS X 10.5 is the host. It also relaxes the 512 MB RAM limit on OS X hosts.


Problem
-------
Some users have been unable to run SheepShaver on OS X 10.5 (Leopard) hosts. The symptom is error "ERROR: Cannot map RAM: File already exists".

SheepShaver allocates RAM at fixed addresses. If it is running in "Real" addressing mode, and can't allocate at address 0, then it was hard-coded to allocate the RAM area at 0x20000000. The ROM area was allocated at 0x40800000.

The normal configuration is for SheepShaver to run under SDL, which is a Cocoa wrapper. By the time SheepShaver does its memory allocations, the Cocoa application has already started. As a result, SheepShaver's address space already contains libraries, fonts, Input Managers, and IOKit areas.

On Leopard hosts these areas can land on the same addresses SheepShaver needs, so SheepShaver's memory allocation fails.


Solution
--------
The approach is to change SheepShaver (on Unix & OS X hosts) to allocate the RAM area anywhere it can find the space, rather than at a fixed address.

This could result in the RAM being allocated higher than the ROM area, which causes a crash. To prevent this from occurring, the RAM and ROM areas are allocated contiguously.

Previously the ROM starting address was a constant ROM_BASE, which was used throughout the source files. The ROM start address is now a variable ROMBase. ROMBase is allocated and set by main_*.cpp just like RAMBase.

A side-effect of this change is that it lifts the 512 MB RAM limit for OS X hosts. The limit existed because, with the fixed RAM and ROM addresses, the RAM could only grow to 512 MB before it overlapped the ROM area.


Impact
------
The change to make ROMBase a variable applies to all hosts & addressing modes.

The RAM and ROM areas will only shift when run on Unix & OS X hosts; on other hosts the same fixed allocation address is used as before.

This change is limited to "Real" addressing mode. Unlike Basilisk II, SheepShaver *pre-calculates* the offset for "Direct" addressing mode; the offset is compiled into the program. If the RAM address were allowed to shift, it could result in the RAM area wrapping around address 0.


Changes to main_unix.cpp
------------------------
1. Real addressing mode no longer defines a RAM_BASE constant.

2. The base address of the Mac ROM (ROMBase) is defined and exported by this program.

3. Memory management helper vm_mac_acquire is renamed to vm_mac_acquire_fixed. Added a new memory management helper vm_mac_acquire, which allocates memory at any address.

4. Changed and rearranged the allocation of RAM and ROM areas.

Before it worked like this:

  - Allocate ROM area
  - If possible, attempt to allocate RAM at address zero
  - If RAM not allocated at 0, allocate at fixed address

We still want to try allocating the RAM at zero, and if using DIRECT addressing we're still going to use the fixed addresses. So we don't know where the ROM should be until after we do the RAM. The new logic is:

  - If possible, attempt to allocate RAM at address zero
  - If RAM not allocated at 0
      if REAL addressing
         allocate RAM and ROM together. The ROM address is aligned to a 1 MB boundary
      else (direct addressing)
         allocate RAM at fixed address
  - If ROM hasn't been allocated yet, allocate at fixed address
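
The REAL-addressing branch of this logic can be sketched as follows. This is a minimal illustration, not the patch itself: the names MacLayout, align_up and acquire_ram_and_rom are hypothetical, and it assumes a POSIX host with anonymous mmap(). It maps RAM and ROM as one region at whatever address the kernel picks, then places the ROM on the first 1 MB boundary at or after the end of RAM, so the ROM always ends up above the RAM.

```cpp
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>

// Hypothetical names throughout; this is not the SheepShaver source.
struct MacLayout {
    uint8_t *ram_host;   // host address of the RAM area
    uint8_t *rom_host;   // host address of the ROM area (1 MB aligned)
    size_t   map_size;   // total bytes mapped, needed for munmap()
};

constexpr uintptr_t kRomAlign = 0x100000;  // 1 MB

// Round addr up to the next multiple of align (align must be a power of two).
inline uintptr_t align_up(uintptr_t addr, uintptr_t align) {
    return (addr + align - 1) & ~(align - 1);
}

// Map RAM and ROM as one anonymous region at any address the kernel chooses.
// Returns false if the mapping fails.
inline bool acquire_ram_and_rom(size_t ram_size, size_t rom_size, MacLayout &out) {
    // One extra megabyte of slack so the ROM can always be aligned.
    const size_t total = ram_size + rom_size + kRomAlign;
    void *p = mmap(nullptr, total, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return false;
    out.ram_host = static_cast<uint8_t *>(p);
    const uintptr_t rom =
        align_up(reinterpret_cast<uintptr_t>(p) + ram_size, kRomAlign);
    out.rom_host = reinterpret_cast<uint8_t *>(rom);
    out.map_size = total;
    return true;
}
```

A caller would then derive its ROM base from rom_host instead of a compile-time constant, which is the role ROMBase and ROMBaseHost play in the change described here.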

5. Calculate ROMBase and ROMBaseHost based on where the ROM was loaded.

6. There is a crash if the RAM is allocated too high. To try and catch this, check if it was allocated higher than the kernel data address.

7. Change subsequent code from using constant ROM_BASE to variable ROMBase.


Changes to Other Programs
-------------------------
emul_op.cpp, main.cpp, name_registery.cpp, rom_patches.cpp, rsrc_patches.cpp, emul_ppc.cpp, sheepshaver_glue.cpp, ppc-translate.cpp:
Change from constant ROM_BASE to variable ROMBase.

ppc_asm.S: It was setting a register to a hard-coded literal address: 0x40b0d000. Changed to set it to ROMBase + 0x30d000.
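
As a quick check of that offset: with the old fixed ROM address of 0x40800000, ROMBase + 0x30d000 reproduces the old literal exactly. The sketch below only verifies the arithmetic; the names kRomTargetOffset and rom_target are illustrative and do not come from ppc_asm.S.

```cpp
#include <cstdint>

// Offset taken from the commit message; the names are illustrative.
constexpr uint32_t kRomTargetOffset = 0x30d000;

// Address that used to be hard-coded, now derived from the ROM base.
constexpr uint32_t rom_target(uint32_t rom_base) {
    return rom_base + kRomTargetOffset;
}

// With the old fixed ROM address, the result matches the old literal.
static_assert(rom_target(0x40800000) == 0x40b0d000,
              "offset must reproduce the old hard-coded address");
```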

ppc_asm.tmpl: It defined a macro ASM_LO16 that assumed it would always be used with operands that included a register specification. This is not true. Moved the register specification from the macro to the macro invocations.

main_beos.cpp, main_windows.cpp: Since the subprograms are all expecting a variable ROMBase, all the main_*.cpp programs have to define and export it. The ROM_BASE constant is moved here for consistency. The mains for beos and windows just allocate the ROM at the same fixed address as before, set ROMBaseHost and ROMBase to that address, and then use ROMBase for the subsequent code.

cpu_emulation.h: removed ROM_BASE constant. This value is moved to the main_*.cpp modules, to be consistent with RAM_BASE.

user_strings_unix.cpp, user_strings_unix.h: Added new error messages related to errors that occur when the RAM and ROM are allocated anywhere.
2009-08-18 18:26:11 +00:00
asvitkine
d6db773362 [patch from Darik Horn <dajhorn@vanadac.com> ]
Makes SheepShaver compatible with Ubuntu Intrepid and
other distros that bundle the gcc-4.3 compiler.

The patch changes two things:

1. Renames the block_cache where its name collides with its class
definition.

2. Fixes the "explicit template specialization cannot have a storage
class" error in the ppc-dyngen-ops.cpp file.
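
Both problems can be reproduced in a few lines. The sketch below is an illustration only: cpu_sketch, my_block_cache and scale are hypothetical names, and the actual identifiers touched by the patch are not shown in this log. The rejected forms are kept as comments so the block compiles with newer compilers.

```cpp
// 1. A member that shares its name with the class template it instantiates.
template <class T>
struct block_cache {
    T *top = nullptr;
};

struct cpu_sketch {
    // Rejected by newer gcc, because the declaration changes the meaning
    // of the name block_cache inside the class:
    //     block_cache<int> block_cache;
    block_cache<int> my_block_cache;   // hypothetical new name
};

// 2. A storage class specifier on an explicit specialization.
template <int N>
static inline int scale(int x) { return x * N; }

// Rejected by newer gcc:
//     template <> static inline int scale<0>(int) { return 0; }
// The specialization takes its linkage from the primary template, so
// dropping "static" is enough:
template <>
inline int scale<0>(int) { return 0; }
```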
2009-01-15 23:25:08 +00:00
gbeauche
b5746b4f68 Add SSSE3 optimizations (Intel Core 2 CPUs and newer) for LVX, STVX, VPERM.
This brings an overall +10% performance improvement in AltiVec Fractal Carbon.
2008-01-01 21:51:56 +00:00
gbeauche
8e0088c4c6 generate lwarx/stwcx. native code only for uniprocessor emulation 2007-02-17 08:59:56 +00:00
gbeauche
3b6a579f33 Optimize lwarx/stwcx for uniprocessors and generate code for them. There is
no performance increase even though those two instructions represented approx
18M of untranslated instructions on a simple boot to MacOS.
2007-01-18 07:02:35 +00:00
gbeauche
954593d1c0 Generate spcflags checks at the start of the block. This creates better
optimization opportunities once the CR cache is implemented.
2006-07-30 16:29:10 +00:00
gbeauche
7705f85655 Add missing implementations for VAVGUB & VAVGUH. Optimize VSEL too. 2006-07-17 21:47:18 +00:00
gbeauche
e07e2196e3 Use new code generator. The gain is only 10%, bottlenecks are elsewhere.
Optimize Altivec vector splat instructions after Agner's guide.
2006-07-17 06:56:38 +00:00
gbeauche
cc12787047 Prepare for new code generator and mid-level optimizations. 2006-07-16 12:47:38 +00:00
gbeauche
d75a91497d Fix debugging of generated code to include the block chainer trampoline. 2006-07-09 15:17:15 +00:00
gbeauche
abc911eaa7 Remove use of global register A0 (now aliased to T0). This makes it possible
to cache the CPU context pointer in a register and thus render generated
code independent of the CPU context. Not useful to SheepShaver, but it is for
another project for thread emulation on plain x86-32.

Note: AltiVec performance may drop a little on x86 but this will be restored
(and even improved) in the future.
2006-07-09 12:15:48 +00:00
gbeauche
de5389bc0e Fix vminfp & vmaxfp emulation (VEX's jm-ppc-test -a, triggered nan bugs) 2006-07-04 04:47:04 +00:00
gbeauche
357467c97b <cpu/jit/dyngen-exec.h> is necessary because it contains the definitions of
DYNGEN_FAST_DISPATCH for the direct_chaining_possible() test... [otherwise,
it was a performance regression]
2006-01-18 23:45:31 +00:00
gbeauche
0a9a210c30 Disable direct block chaining if DYNGEN_FAST_DISPATCH is not defined. Note
this is a workaround prior to enabling it on mips and the future JIT.
2005-12-12 21:44:50 +00:00
gbeauche
0db0d48bf0 Avoid the use of floating-point when loading/storing from/to memory. This
could have caused some rounding thus alterations to integer registers on
context switches when lfd/stfd instructions were used. e.g. cygwin compilers
defaulted to i686 code generation and exhibed this behaviour, you could also
see this behavior with -march=i586 -mtune=pentiumpro. GCC is perfectly right
to do those optimizations.
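
The hazard described here is that a 64-bit value routed through a floating-point register can come out with different bits. A minimal sketch of the safer pattern follows, assuming hypothetical helper names (load_fp_bits, store_fp_bits) that are not from the SheepShaver source: treat the guest double as an opaque 64-bit integer and copy it with memcpy, so the bit pattern is preserved whatever instructions the compiler picks. Guest byte order handling is left out.

```cpp
#include <cstdint>
#include <cstring>

// Read 8 bytes of guest memory as raw bits, never as a double.
inline uint64_t load_fp_bits(const void *src) {
    uint64_t bits;
    std::memcpy(&bits, src, sizeof bits);
    return bits;
}

// Write the raw bits back; no floating-point conversion can occur because
// the value is only ever typed as an integer.
inline void store_fp_bits(void *dst, uint64_t bits) {
    std::memcpy(dst, &bits, sizeof bits);
}
```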
2005-03-20 23:07:11 +00:00
gbeauche
8071d90849 workaround weird bug lying somewhere in cygwin generated micro-ops for
FP load/store of doubles
2005-03-19 07:18:18 +00:00
gbeauche
05a7453d54 MMX/SSE/SSE2 optimizations are now converted to full inline assembly code,
aka avoid use of (possibly broken) GCC intrinsics. Add some SSE2 optimizations.
Translate VSLDOI, MFVSCR, MTVSCR instructions. AltiVec Fractal Carbon now
shows more than 1 GFlops performance!
2005-03-13 12:49:30 +00:00
gbeauche
df0d5d2a41 Happy New Year 2005! 2005-01-30 21:48:22 +00:00
gbeauche
72b26d7ff7 Stop forced compilation when entering a new JIT execution level. 2004-06-15 21:27:46 +00:00
gbeauche
f574a5df05 Cleanups. Rewrite gen_bc() so that no push/pop could be inserted thus
causing crashes with some compilers. However, that's slower.
2004-06-09 16:36:44 +00:00
gbeauche
619aa9b319 Translate LMW, STMW and DCBZ instructions. 2004-05-23 16:34:38 +00:00
gbeauche
b0aae35951 Do FOLLOW_CONST_JUMPS for bcl 20,BI,TARGET branches too, since that's an
unconditional jump and we don't need the LR in that case.

Also fix this:
SheepShaver: ../kpx_cpu/src/cpu/ppc/ppc-translate.cpp:1499: powerpc_block_info* powerpc_cpu::compile_block(unsigned int): Assertion `dg.jmp_addr[i] != __null' failed.
Aborted

aka. StuffIt Expander + pressing the 'Cancel' button.
2004-05-23 06:41:25 +00:00
gbeauche
f376933138 Attempt to fix direct block chaining code in corner cases. e.g. really
chain only blocks within page boundaries (compare against block entry point)
2004-05-22 17:57:36 +00:00
gbeauche
c3f2342f47 Make NativeOp() handler a sheepshaver_cpu handler, thus getting rid of ugly
GPR macro definition.

Make the JIT engine somewhat reentrant. This brings a massive performance
boost for applications that cause many Execute68k(). e.g. audio in PlayerPRO.
2004-05-19 21:23:17 +00:00
gbeauche
81ae2fee40 Direct block chaining on x86 and amd64 too. Optimize do_execute_branch_bo<>
No need to update Program Counter if we have direct linked blocks.

TODO: remove obsolete PC-related generators
2004-05-12 10:44:04 +00:00
gbeauche
15a0779328 Size optimization: don't generate jump_next_A0() code in block chaining
mode since the only case we would reach that is when there are pending
interrupts, thus needing to exit from this basic block ASAP. Otherwise,
we jumped to linker trampolines
2004-05-11 21:53:48 +00:00
gbeauche
08bcd2653d direct block chaining, aka faster block dispatcher 2004-05-11 20:53:25 +00:00
gbeauche
443231c1da First round of SSE/MMX optimizations & experimentations. AltiVec Fractal
Carbon performance increased by a factor of 8 (420 MegaFlops).
2004-02-20 17:18:44 +00:00
gbeauche
ea3c6801ab Experiment with generic AltiVec optimizations for V4SF, V2DI operands (+60%) 2004-02-16 23:17:27 +00:00
gbeauche
d10a3586f1 Year got increased "recently". ;-) 2004-02-16 10:57:07 +00:00
gbeauche
313cddeeb2 AltiVec emulation! ;-) 2004-02-15 17:17:37 +00:00
gbeauche
8afa65cc96 Inline fast basic block lookups. Only check top tag as it is a hit more than
95% of the time. Overall, this improves performance by more than 2x on a P4.
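
One way to picture the idea is a hashed block cache whose lookup is split in two: a tiny inlined fast path that compares only the top entry of the bucket, and an out-of-line slow path that walks the chain and moves a hit to the top. The sketch below is an assumption about the general technique, not the actual implementation; Block, BlockCacheSketch, the bucket count and the hash function are all hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical types; the real block cache is not shown in this log.
struct Block {
    uint32_t pc;     // guest address of the block's first instruction
    Block   *next;   // next block in the same hash bucket
};

class BlockCacheSketch {
    static constexpr size_t kBuckets = 1024;   // assumed size, power of two
    Block *buckets_[kBuckets] = {};

    static size_t index(uint32_t pc) { return (pc >> 2) & (kBuckets - 1); }

    // Slow path: walk the chain and move a hit to the top of its bucket.
    Block *find_slow(uint32_t pc) {
        Block **link = &buckets_[index(pc)];
        for (Block *b = *link; b != nullptr; link = &b->next, b = b->next) {
            if (b->pc == pc) {
                *link = b->next;                 // unlink from the chain
                b->next = buckets_[index(pc)];
                buckets_[index(pc)] = b;         // reinsert at the top
                return b;
            }
        }
        return nullptr;
    }

public:
    void insert(Block *b) {
        b->next = buckets_[index(b->pc)];
        buckets_[index(b->pc)] = b;
    }

    // Fast path, small enough to inline: compare only the top entry.
    Block *find(uint32_t pc) {
        Block *top = buckets_[index(pc)];
        if (top != nullptr && top->pc == pc)
            return top;
        return find_slow(pc);
    }
};
```

Because a slow-path hit is moved to the top, a block that is looked up repeatedly is served by the fast path from the second lookup onward.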
2004-01-27 13:54:51 +00:00
gbeauche
ea9553ee65 Optimize rlwinm further. Translate FP instructions if we don't need to
compute exceptions.
2004-01-25 23:21:06 +00:00
gbeauche
3de5a15902 Don't define disasm_block() in non-JIT mode. Also make sure to disassemble
native code if we can (i.e. TARGET_NATIVE disassembler exists).
2004-01-24 11:52:54 +00:00
gbeauche
60d371486b Propagate done_compile down to compile1() in case it needs to override
the end-of-block condition (e.g. sheep EmulOps)
2004-01-24 11:22:48 +00:00
gbeauche
09cd7ccfd6 gcc on darwin defines __ppc__, not __powerpc__ 2004-01-14 23:16:37 +00:00
gbeauche
5dca41d253 Add gen_invoke_CPU_im_im() to invoke do_record_step(pc, opcode). 2003-12-04 17:53:04 +00:00
gbeauche
04214f3820 Fix "decrement the CTR, then branch conditionally if the decremented
CTR != 0". Remove CR cache for now. Remove BC & MODE_68K hacks for SheepShaver,
which were collateral damage of the wrong branch emulation of the former.
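
For reference, the branch-conditional semantics in question can be written out as a small function. This is a sketch following the BO field encoding of the PowerPC architecture, not the emulator's code; branch_taken and its parameters are hypothetical names. BO is the 5-bit field from the instruction, cr_bit is the CR bit selected by BI, and ctr is the count register.

```cpp
#include <cstdint>

// Returns true if the branch is taken. Decrements ctr first when BO asks
// for it, so the CTR test sees the already decremented value.
inline bool branch_taken(unsigned bo, bool cr_bit, uint32_t &ctr) {
    const bool ignore_cond   = (bo & 0x10) != 0;  // do not test the CR bit
    const bool want_cond     = (bo & 0x08) != 0;  // CR bit value to branch on
    const bool keep_ctr      = (bo & 0x04) != 0;  // do not decrement or test CTR
    const bool want_ctr_zero = (bo & 0x02) != 0;  // branch if CTR == 0

    if (!keep_ctr)
        --ctr;

    const bool ctr_ok  = keep_ctr || ((ctr != 0) != want_ctr_zero);
    const bool cond_ok = ignore_cond || (cr_bit == want_cond);
    return ctr_ok && cond_ok;
}
```

With BO = 16 this is the "decrement, then branch if CTR != 0" case; with BO = 20 both tests are skipped and the branch is unconditional.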
2003-12-02 22:49:18 +00:00
gbeauche
e2ca6270f8 Implement ISYNC, MTCRF, MCRF. 2003-12-01 13:40:38 +00:00
gbeauche
054748532a NOP'ize unimplemented instructions 2003-12-01 13:21:41 +00:00
gbeauche
dd956c78db gather some stats on untranslated instructions 2003-12-01 13:07:26 +00:00
gbeauche
f034ae704f handle ROM areas and put associated blocks into dormant state 2003-12-01 00:16:21 +00:00
gbeauche
ceb9b4a428 cleanups & optimize for constant branches (i.e. follow them). 2003-12-01 00:03:02 +00:00
gbeauche
7594e26d36 fix new block creation on full cache that was just invalidated, add
provisions for following constant jumps in next commit.
2003-11-30 17:16:24 +00:00
gbeauche
2bacb2fd01 Workaround CR expectations in MODE_68K execution 2003-11-27 11:06:23 +00:00
gbeauche
e30001bc00 Fix BCCTR & BCLR. However, conditions are still wrong somehow, disabled
this case. Factored & optimized branch instructions.
2003-11-26 23:58:14 +00:00
gbeauche
73d51962f6 Merge in-progress PowerPC "JIT1" engine for AMD64, IA-32, PPC.
The merge probably went wrong, as there are some problems, probably due to the
experiment beginning with CR deferred evaluation. With nbench/ppc, performance
improvement was around 2x. With nbench on x86, performance improvement was
around 4x on average.

Incompatible change: instr_info_t has a new field in the middle. But since
insertion of PPC_I(XXX) identifiers is auto-generated, there is no problem.
2003-11-24 23:45:52 +00:00