Last use of grab_return was removed in f204caa9079aa94d90e1a8ef650b845283c1d46a.
grab_breakpoint was added in 2bd717e2931cba5be3152f92b3cca5e82e446759 but
never used.
There's no reason for it to be a global, we always set it and use it
in instruction implementations, and we never read it directly.
Perhaps the compiler could optimize this away, but it's better to be
simpler (and also be easier to read).
Mode 1 contains real addressing mode entries, which by definition cannot
be using segment registers. By skipping over them, we can shave off a
couple of seconds from the 10.2 boot time.
They happen surprisingly often, and flushing the TLB is expensive
because we need to walk over all entries.
Takes booting 10.2 on a Beige G3 from binary start to "Welcome to Macintosh"
from 58s to 38s on my machine.
Keeps track of instructions (including operands) that are executed,
to see if there are any hotspots that could be optimized or fastpaths
that should be added.
Also adds a mode where CPU profiler data is periodically output, to
make it easier to get at these instruction counts during startup.
cur_dma_rgn->end is the last byte of a region. It is not the byte after the region. Therefore, subtract 1 from size before doing compare.
Also add more detail to the abort messages.
None of the POWER opcodes uses it now, plus it is a duplicate of ppc_setsoov (though ppc_setsoov is inline so it would have to be moved to be able to use it in poweropcodes.cpp?
Use U instead of UL. U will use the smallest size that can fit all the unsigned bytes. Since 0xFFFFFFFF fits in 32 bits, the 0xFFFFFFFFU is a uint32_t.
Including bits of rot_sh in the rA and MQ calculations is nonsensical since it is a rotation count and not a source of bits to be extracted or rotated.
The mask is not complicated, so we don't need to use power_rot_mask.
Fix carry flag calculation. Anding with the rotation count (n = rB) is nonsensical.
(r & ~mask) is the rotated word ANDed with the complement of the generated mask of n zeros followed by 32 - n ones.
The manual says this 32-bit result is ORed together. This means all the bits are ORed together which is equivalent to saying 0 if all zeros and 1 if any ones. In other words: (r & ~mask) != 0.
This boolean is ANDed with bit 0 of rS to produce the carry. int32_t(rS) < 0 will test bit 0. The && operator will treat each side as a boolean so you can exclude "!= 0" tests.
If bit 26 of rB is set then the mask should be all ones.
If bit 26 of rB is set then rA should be all ones or all zeros (depending on the sign bit of rA).
Test bit 26 of rB instead of using >= 0x20 to determine which operation to perform.
The two operations need to be switched such that rA is cleared when bit 26 is set.
Don't forget to store the result in rA.
Test bit 26 of rB instead of using >= 0x20 to determine which operation to perform.
Since the mask is not complicated, we don't need to use power_rot_mask.
It is redundant to test bit 0 of rS and then use bit 0 of rS in the case when bit 0 of rS is set.
In the case when bit 0 of rS is not set, using bit 0 or rS is incorrect since it results in no change of rA.
Operands are supposed to be twos complement numbers.
Calculate overflow first before calculating condition codes because the overflow condition is copied from XER.
Fix OV calculation. Previously, it was using power_setsoov which I think is only for add and subtract operations.
Fix CR calcalation. It's supposed to depend on the low order 32 bits that are placed into MQ.
- Fix CR calculation. It depends on whether a match occurred and only the EQ flag is affected.
- Remove bytes_copied. We can subtract bytes_remaining from bytes_to_load to calculate that.
- Initialize ppc_result_d to zero so that bitmask is not needed to add new bytes to it. This is ok since the manual says that bytes that are not loaded are undefined.
Calculate overflow first before calculating condition codes because the overflow condition is copied from XER.
Fix OV calculation. Previously, it was using power_setsoov which I think is only for add and subtract operations. doz does a subtract but only if the result is supposed to be positive, therefore a negative result indicates an overflow.
dividend and divisor are supposed to be a twos compliment numbers.
Fix OV calculation. Previously, it was using power_setsoov which I think is only for add and subtract operations.
Fix CR calculation. It depends on the remainder, not the quotient.
dividend is supposed to be a twos compliment number.
Fix test for dividend = -0x80000000 and divisor = -1. Previously, the test was assuming dividend was a 32-bit value from rA.
Fix OV calculation. Previously, it was using power_setsoov which I think is only for add and subtract operations.
Fix CR calculation. It depends on the remainder, not the quotient.
For MPC601 CPUs, all values of rA return 64 though the manual says undefined values of rA produce undefined results.
For non-MPC601 CPUs, if this instruction is included (such as for risu DPPC) then return results that are obtained from a G4 running Mac OS 9.2.2.
Making a negative value positive requires unary negate operator rather than binary and operator since negative numbers are stored using twos compliment.
If ov is set then clear overflow when overflow doesn't happen.
- Rename DEC to DEC_S and add DEC_U.
- MQ, RTCL_U, RTCU_U, and DEC_U should cause an illegal instruction program exception for non-MPC601 CPUs. The exception handler of classic Mac OS uses this to emulate the instruction.
- For mtspr, the SPRs RTCL_U, RTCU_U, and DEC_U are treated as no-op on MPC601.
- For debugging, use the supervisor instead of the user SPR number as the index for storing the values for RTC, TB, and DEC.
- For debugging, RTC, TB, and DEC should be updated after each access. Previously, mfspr and mtspr would only update the half of RTC and TB that was being accessed instead of both halves.
Accessing an SPR with bit 4 set (> 15) requires supervisor privilege and should cause a supervisor-level instruction exception (privileged instruction type program exception).
The first option is a flag that enables MPC601 (POWER) instructions for CPUs that are not MPC601.
This can be useful for the following reasons:
1) To produce results similar to classic Mac OS which emulates MPC601 instructions on CPUs that don't implement MPC601 instructions. This option is used to compare the risu traces produced in Mac OS 9 on a G3 or G4 with DPPC.
2) May increase performance in apps that use POWER instructions on emulated machines with CPUs that are not MPC601. It is not known if any such apps exist but there could be since Apple included MPC601 emulation in classic Mac OS.
I don't know if the compiler is smart enough to figure out that ((guest_va & 0xFFF) + sizeof(T)) > 0x1000) is always false when sizeof(T) == 1 so we'll add a check for sizeof(T) > 1.