Commit work-in-progress.

Some improvements made on accuracy of circuit descriptions.
This commit is contained in:
Andrew Makousky 2020-12-22 18:05:27 -06:00
parent 6a74bb5a13
commit b957f0cb84
2 changed files with 112 additions and 147 deletions

View File

@ -90,8 +90,8 @@ a huge number of pins, its purpose can be summarized as follows.
Also, note that the address multiplexing is configured as follows:
Row access strobe: A1, A11, A12, A13, A14, A15, A16, RA7*, A18, RA9*.
Column access strobe: A2, A3, A4, A5, A6, A7, A8, RA7*, A10, RA9*.
Row access strobe: A2, A3, A4, A5, A6, A7, A8, RA7*, A10, RA9*.
Column access strobe: A1, A11, A12, A13, A14, A15, A16, RA7*, A18, RA9*.
RA7 and RA9 are controlled directly by the BBU rather than being
wired through the F257 address multiplexers. Now, the devil is in
@ -107,18 +107,29 @@ a huge number of pins, its purpose can be summarized as follows.
configuration of one row of DRAM SIMMs, this means you can configure
a Macintosh SE with only 128K of RAM. Hilarious!
If only 256K RAM SIMMs are installed, then RA7 is either A17 (row)
or A9 (column). RA9 is not used. If there are two rows of DRAM
If only 256K RAM SIMMs are installed, then RA7 is either A9 (row) or
A17 (column). RA9 is not used. If there are two rows of DRAM
SIMMs, A19 is used to determine which one to use.
If 1MB RAM SIMMs are installed, then RA7 is either A17 (row) or A9
(column). RA9 is either A20 (row) or A19 (column). If there are
If 1MB RAM SIMMs are installed, then RA7 is either A9 (row) or A17
(column). RA9 is either A19 (row) or A20 (column). If there are
two rows of DRAM SIMMs, A21 is used to determine which one to use.
Since RAM accesses are always on even bytes in 16-bit quantities, A0
is implied to be zero and therefore also is not available on the
CPU.
The purpose of this weird scrambled address mapping is to satisfy
two goals:
1. To allow the framebuffer scanning circuitry to double as DRAM
refresh.
2. To enable fast-page mode (FPM) for fetching two 16-bit words in
sequence (one "longword"). This in turn reduces the BBU's
memory access overhead and therefore increases the speed of CPU
memory accesses.
* So, wow. Here's a list of all possible Macintosh SE RAM
configurations.

View File

@ -230,34 +230,6 @@ module bbu_master_ctrl
// Installed RAM size.
wire [23:0] ramsz;
// SCSI support: Handle chip select, and handle DMA. Important!
// Never have *DACK and *SCSI active simultaneously. Okay, so
// let's get this straight. When we would normally assert *DTACK,
// we release *SCSI and wait for SCSI.DRQ. Then when we receive
// it, we can assert *DTACK and also *DACK, with a timer to
// deassert *DACK. Important! Make sure we do not go faster than
// the minimum read/write pulse width of the SCSI chip.
// Important! How to handle SCSI DMA... according to Guide to the
// Macintosh family hardware, page 126, this is "pseudo-DMA" mode.
// The BBU does not assert `*DTACK` until `SCSIDRQ` is received to
// indicate the DMA transfer is complete. And again, noting from
// page 126, . Likewise, `*DTACK` is asserted for all addresses in
// range, even if nothing is mapped. No stringent bus error
// control here, that's a hobbyist extension. And again, inded
// `*DTACK` is tri-stated according to the manual when `*EXTDTK`
// (`*EXT.DTACK`) is asserted.
// TODO: How do we know if DMA mode is true? Do we have to snoop
// the address but to get this information.
// Now, here's the golden rule: "If any access has not terminated
// within 265 ms, the BBU asserts the bus error signal /BERR."
// There you go, that's how it is driven, though the condition only
// happens for PDS and SCSI accesses. So, another thing,
// yes... there must be at least a nominal delay to assert `*DTACK`
// to allow PDS cards to intervene by asserting `*EXTDTK` first.
// TODO MOVE DOCUMENTATION: PLEASE NOTE, PDS cards can also access
// DRAM, not just the CPU. This is mainly a matter of bus
// arbitration, then as far s the BBU is concerned, PDS access to
@ -668,12 +640,6 @@ endmodule
* 0xf00000 - 0xffffef: ??? (the ROM appears to be accessing here)
* 0xfffff0 - 0xffffff: Auto Vector
TODO FIXME: Note that SCSI chip enable is NOT asserted when A9 is
one, and Macintosh Plus asserts DACK when A9 is one, but not when
it is zero. Okay, so I think I have that figured out. Add 512 for
DMA mode access logic, otherwise we do not implement DMA access
mode at all.
This address map has also been confirmed with Guide to the
Macintosh family hardware, page 127. PLEASE NOTE: In SCC Read
zone, if A0 == 1, then that is an SCC RESET. IWM must be A0 == 1,
@ -697,40 +663,6 @@ endmodule
overflow accesses should just wrap around and repeat access to the
same RAM.
PLEASE NOTE: Guide to the Macintosh family hardware, page 122. "A
word-wide access to any SCC address causes a phase shift in the
processor clock, and is used by the operating system to correct the
phase when necessary."
"At system startup, the operating system reads an address in the
range $F0 0000 through $F7 FFFF (labeled _Phase read_ in Gifgures
3-1 and 3-2) to determine whether the computer's high-frequency
timing signals are correctly in phase. When the timing signals are
not in phase, RAM accesses are not timed correctly, causing an
unstable video display, RAM errors, and VIA errors." Well, I can
see how that would be happening with just a bunch of PALs, but I
don't think it still needs to be that way when you have the BBU in
charge, you can do better! And indeed, that note only appears to
apply to the Macintosh Plus, not the Macintosh SE, as it is listed
in that section.
But, for the sake of Macintosh Plus recreation, please note. The
TSG PAL places one of the high-frequency phase indicator signals on
D0 of the address bus, I assume it is the 1 MHz *PMCYC signal. A
multiple address read instruction is used to read three consecutive
data values from the address bus in synchronous I/O mode (due to
using address 0xf00000), so each address read is either 10 or 20
CPU cycles long. This will sample the phase at a few different
points. The phase readings are added together, if they are zero or
one, then we are "in-phase." Otherwise, phase readings 2 and 3 are
considered "out-of-phase" so we access a word-width address in the
SCC range to shift the high frequency timing by 128 ns (one CPU
clock cycle at 8 MHz).
The important thing to remember is that every MC68000 instruction
executes for an even number of clock cycles (divisible by 2), and
there is no pipelining in these early CPUs.
TODO FIXME: Guide to the Macintosh family hardware, page 127.
Okay, so this is how to interpret the information about the
boot-time overlay for the alternate RAM location. Only a 2MB zone
@ -744,15 +676,47 @@ endmodule
I don't really quite understand, though, sorry. Okay, this means,
the first row, right? "If 2.5 or 4 MB only upper row is
accessible" page 127.
The signal *VPA is asserted in address range 0xe00000 - 0xffffff,
optionally excluding invalid addresses when a bus error signal is
generated. This is for synchronous I/O devices accessed in the old
6800 fashion.
----------
But, for the sake of Macintosh Plus recreation, please note.
Guide to the Macintosh family hardware, page 122. For the
Macintosh Plus: "A word-wide access to any SCC address causes a
phase shift in the processor clock, and is used by the operating
system to correct the phase when necessary."
"At system startup, the operating system reads an address in the
range $F0 0000 through $F7 FFFF (labeled _Phase read_ in figures
3-1 and 3-2) to determine whether the computer's high-frequency
timing signals are correctly in phase. When the timing signals are
not in phase, RAM accesses are not timed correctly, causing an
unstable video display, RAM errors, and VIA errors." Well, I can
see how that would be happening with just a bunch of PALs, but I
don't think it still needs to be that way when you have the BBU in
charge, you can do better! And indeed, that note only appears to
apply to the Macintosh Plus, not the Macintosh SE, as it is listed
in that section.
The TSG PAL places one of the high-frequency phase indicator
signals on D0 of the address bus, I assume it is the 1 MHz *PMCYC
signal. A multiple address read instruction is used to read three
consecutive data values from the address bus in synchronous I/O
mode (due to using address 0xf00000), so each address read is
either 10 or 20 CPU cycles long. This will sample the phase at a
few different points. The phase readings are added together, if
they are zero or one, then we are "in-phase." Otherwise, phase
readings 2 and 3 are considered "out-of-phase" so we access a
word-width address in the SCC range to shift the high frequency
timing by 128 ns (one CPU clock cycle at 8 MHz).
The important thing to remember is that every MC68000 instruction
executes for an even number of clock cycles (divisible by 2), and
there is no pipelining in these early CPUs.
*/
module decode_devaddr (n_res, clk, n_ramen, n_romen, n_scsi,
n_sccen, n_sccrd, n_iow, n_iwm, via_cs1, n_vpa,
n_berr, n_as, a23_19, berr_ram, n_extdtk,
module decode_devaddr (n_res, clk, n_ramen, n_romen, n_scsi, scsidrq,
n_dack, n_sccen, n_sccrd, n_iow, n_iwm, via_cs1,
n_vpa, n_berr, n_as, a23_19, a9, n_extdtk,
boot_overlay, r_n_w, reg_romen, reg_ram_w,
n_dtack_peri);
input wire n_res;
@ -760,6 +724,8 @@ module decode_devaddr (n_res, clk, n_ramen, n_romen, n_scsi,
output wire n_ramen;
output wire n_romen;
output wire n_scsi;
input wire scsidrq;
output wire n_dack;
output wire n_sccen;
output wire n_sccrd;
output wire n_iow;
@ -769,7 +735,7 @@ module decode_devaddr (n_res, clk, n_ramen, n_romen, n_scsi,
output wire n_berr;
input wire n_as;
input wire [4:0] a23_19;
input wire berr_ram; // Would this RAM address be a bus error?
input wire a9;
input wire n_extdtk;
input wire boot_overlay;
input wire r_n_w;
@ -781,14 +747,14 @@ module decode_devaddr (n_res, clk, n_ramen, n_romen, n_scsi,
output wire n_dtack_peri; // *DTACK for peripherals
wire reg_ram, reg_ram_r;
wire scdma; // host requested performing a SCSI pseudo-DMA read/write
wire scsi, sccrd, sccwr;
wire berr_scc;
wire n_dtack_peri_pt; // *DTACK peripherals "pre-trigger"
reg n_dtack_peri_bf; // *DTACK for peripherals buffer
reg [31:0] berr_cntr; // Counter to trigger `*BERR` after 265 ms
// If the boot-time overlay is enabled but we attempt to write to
// the regular RAM region, then this is a *RAMEN trigger. The
// the regular RAM region, then this is a `*RAMEN` trigger. The
// overlay control logic will zero the switch on the next cycle,
// but we use combinatorial logic here to act immediately.
assign reg_ram = ~n_as & (a23_19[4:3] == 2'b00);
@ -799,10 +765,19 @@ module decode_devaddr (n_res, clk, n_ramen, n_romen, n_scsi,
(a23_19[4:1] == 4'h7))) | reg_ram_w) :
reg_ram);
assign reg_romen = ~n_as & (a23_19[4:1] == 4'h4);
// Only trigger *ROMEN for reads, not writes, in overlay zone.
// Only trigger `*ROMEN` for reads, not writes, in overlay zone.
assign n_romen = ~(reg_romen | (boot_overlay & reg_ram_r));
assign scsi = ~n_as & (a23_19[4:0] == 5'b01011);
assign n_scsi = ~scsi;
// Note that the SCSI chip enable is NOT asserted in pseudo-DMA
// access mode, which is indicated by A9 (add decimal 512 to base
// address).
assign scdma = scsi & a9;
// assign n_scsi = ~(scsi & ~scdma);
assign n_scsi = ~(scsi & ~a9); // simplification
// N.B.: One idea I had was to use a timer to de-assert `*DACK` to
// ensure it is not held too long. For now I am assuming this is
// not necessary given the MC68000 address bus speed.
assign n_dack = scdma & scsidrq;
assign sccrd = ~n_as & (a23_19[4:1] == 4'h9);
assign sccwr = ~n_as & (a23_19[4:1] == 4'hb);
assign n_sccen = ~(sccrd | sccwr);
@ -810,7 +785,11 @@ module decode_devaddr (n_res, clk, n_ramen, n_romen, n_scsi,
assign n_iow = ~((scsi & ~r_n_w) | sccwr);
assign n_iwm = ~(~n_as & (a23_19[4:1] == 4'hd));
assign via_cs1 = ~n_as & (a23_19[4:0] == 5'b11101);
assign n_vpa = ~(via_cs1 | (~n_as & (a23_19[4:1] == 4'hf)));
// The signal *VPA is asserted in address range 0xe00000 -
// 0xffffff. This is for synchronous I/O devices accessed in the
// old 6800 fashion.
assign n_vpa = ~(~n_as & ((a23_19[4:1] == 4'he) |
(a23_19[4:1] == 4'hf)));
// N.B.: According to Guide to the Macintosh family hardware, page
// 126, the implementation of Auto Vector is easy and
// straightforward for the BBU. Just assert `*VPA` in the address
@ -822,19 +801,12 @@ module decode_devaddr (n_res, clk, n_ramen, n_romen, n_scsi,
// write-only SCC address space, and vice versa.
assign berr_scc = (sccrd & ~r_n_w) | (sccwr & r_n_w);
// Note that if the PDS card asserts *EXTDTK, we also must not
// drive *BERR. TODO EVALUATE: Should we wait a cycle before
// asserting *BERR to give the PDS card time to respond first?
assign n_berr = n_as | ~n_extdtk |
~((~n_ramen & berr_ram) |
berr_scc |
(a23_19[4:0] == 5'b01010) |
(a23_19[4:1] == 4'h7) |
(a23_19[4:1] == 4'h8) |
(a23_19[4:1] == 4'ha) |
(a23_19[4:1] == 4'hc) |
(a23_19[4:0] == 5'b11100));
// TODO: Also flag bus errors for the final address zone.
// Here's the rule on asserting `*BERR`, Guide to the Macintosh
// Family hardware, page 126: "If any access has not terminated
// within 265 ms, the BBU asserts the bus error signal /BERR." In
// practice, this would possibly only occur for PDS and SCSI
// accesses.
assign n_berr = ~(berr_cntr >= 4240000);
// NOTE: For all peripherals, we must set `*DTACK` from the BBU
// upon successful access condition and time durations because it
@ -845,25 +817,30 @@ module decode_devaddr (n_res, clk, n_ramen, n_romen, n_scsi,
// N.B.: According to Guide to the Macintosh family hardware,
// `*DTACK` is not used to respond to addresses in the range
// 0xe00000 - 0xffffff, only below that. `*VPA` alone is used to
// respond to these addresses. So therefore, we exclude `VIA.CS1`.
assign n_dtack_peri_pt = n_scsi & n_sccen & n_iwm;
// N.B. We use combinatorial logic here to deassert *DTACK for
// peripherals as soon as *AS is released.
assign n_dtack_peri = n_as | n_dtack_peri_bf;
// respond to these addresses. Except for the case of SCSI
// pseudo-DMA accesses, we only assert `*DTACK` when we assert
// `*DACK`.
assign n_dtack_peri = n_as | ((n_vpa | scdma) & n_dack);
// N.B. For now, we leave it to higher level logic to put on
// `*DTACK` on high-impedance when `*EXTDTK` is asserted.
// assign n_dtack_peri =
// n_extdtk ? (n_as | ((n_vpa | scdma) & n_dack)) : 'bz;
always @(negedge n_res) begin
n_dtack_peri_bf <= 1;
berr_cntr <= 0;
end
always @(posedge clk) begin
if (n_res) begin
if (n_as)
n_dtack_peri_bf <= 1;
berr_cntr <= 0;
else begin
if (n_dtack_peri_pt)
n_dtack_peri_bf <= 0;
if (berr_cntr < 4240000) // # of 16 MHz pulses in 265 ms
berr_cntr <= berr_cntr + 1;
else
n_dtack_peri_bf <= 1;
; // Nothing to be done.
end
end
else
@ -871,38 +848,6 @@ module decode_devaddr (n_res, clk, n_ramen, n_romen, n_scsi,
end
endmodule
// Determine if a RAM address is out of range and should therefore
// signal a bus error.
module berr_ram_logic (a0_21, ramsz, /*ramsz_en,*/ berr_ram);
input wire [21:0] a0_21;
input wire [23:0] ramsz;
// input wire [6:0] ramsz_en;
output wire berr_ram;
// We could either use arithmetic comparison (easiest to code in
// Verilog), or bit-wise comparisons (possibly more efficient in
// hardware). We may simply consider implementing this logic in a
// separate module.
// 4MB valid: 0x000000 - 0x3fffff
// 4MB invalid: None!
// 2.5MB valid: 0x000000 - 0x27ffff
// 2.5MB invalid: 0x280000 - 0x3fffff
// 2MB valid: 0x000000 - 0x1fffff
// 2MB invalid: 0x200000 - 0x3fffff
// 1MB valid: 0x000000 - 0x0fffff
// 1MB invalid: 0x100000 - 0x3fffff
// 512K valid: 0x000000 - 0x07ffff
// 512K invalid: 0x080000 - 0x3fffff
// 256K valid: 0x000000 - 0x03ffff
// 256K invalid: 0x040000 - 0x3fffff
// 128K valid: 0x000000 - 0x01ffff
// 128K invalid: 0x020000 - 0x3fffff
// N.B. Verilog comparison is unsigned by default.
assign berr_ram = { 1'b0, a0_21[21:17] } < ramsz[22:17];
endmodule
// Boot-time memory overlay register and controlling logic. This is
// fairly straightforward to implement once you see all the other
// logic of the BBU in place.
@ -959,6 +904,7 @@ endmodule
// RA7/RA9 selector logic. Determine which CPU address pins should be
// routed to these RAM address pins based off of the installed RAM.
// TODO FIXME: This is incorrect in light of new knowledge.
module dramctl_ra7_9 (ra7, ra9, cas_n_ras, row2, mbram, s64kram,
a9, a17, a19, a20, a10);
output wire ra7;
@ -1551,7 +1497,7 @@ endmodule
/* TODO: Summary of what is missing and left to implement: DRAM
initialization pulses, DRAM refresh, detect 2.5MB of RAM and
configure address buffers accordingly, video, disk, and audio
scanout, SCSI DMA, EXTDTK yielding.
scanout, EXTDTK yielding.
Okay, so the VERDICT on DRAM initialization pulses. We don't
actually use these as we should, strictly speaking, but why does it
@ -1602,8 +1548,16 @@ So, how many longwords for the video framebuffer?
In hex: 0x1560
Number of address bits fully covered by a full scan: 12
Okay, so the question, does it work for DRAM refresh? Indeed it
does! Well, at least for <=1MB of RAM.
Okay, so the question, does it work for DRAM refresh? Indeed it does!
Well, the number of longwords swept is great enough to cover all DRAM
rows for 4MB of RAM, but the address bit mapping appears only to work
for <=1MB of RAM.
Please note that since we use only a single row access strobe signal
for both DRAM rows and instead use separate column access strobes to
differentiate between the rows, even if all the video memory addresses
are only in one row, we still refresh the other row as long as we
cover all the row addresses.
RA9 looks to be trouble. But, the Unitron reverse engineering docs
almost have a solution. Set this to A17 (?) and it should "just work"