mirror of
https://github.com/forth-ev/VolksForth.git
synced 2024-11-22 05:32:28 +00:00
Add doc/About.md describing the content of the doc/ directory, as well as
the procedure used when creating the new scans.
parent
9be38bc7ad
commit
c0d0e019f5
122
doc/About.md
Normal file
@ -0,0 +1,122 @@
# About the Scanned Manuals

This directory's main content is scanned versions of the original German
manuals of VolksForth from the 1980s. There were 4 main flavours of VolksForth
and accordingly 4 manuals: C64/C16/Plus4, Atari ST, CP/M and MSDOS.

The manuals for C64/C16/Plus4, Atari ST and MSDOS have recently been rescanned
and OCR-ed. Of the CP/M manual we have an older scan and an almost complete
[Org Mode](https://orgmode.org/) transcript. A partial Org Mode transcript also
exists of the MSDOS manual.

Based on the different text versions of the different manuals (transcripts,
sidecar files from `ocrmypdf`), a translation into an English manual is being
started in the 6502/C64/doc directory for the C64 3.9.6 release. Eventually
this is intended to result in a unified manual for all versions.

Note: The mix of Org Mode and Markdown in the documents here stems from the
different preferences or past habits of different contributors.

## VolksForth CBM 3.80 Manual

The [doc/cbm/](cbm) directory contains the German manual for the C64/C16/Plus4
VolksForth version 3.80.

* [vf-cbm-380-manual-de.pdf](cbm/vf-cbm-380-manual-de.pdf) is the scanned and
  OCR-ed PDF.
* [vf-cbm-380-manual-de.sidecar.txt](cbm/vf-cbm-380-manual-de.sidecar.txt)
  is the sidecar text output generated by
  [OCRmyPDF](https://github.com/ocrmypdf/OCRmyPDF)'s option `--sidecar`.
* [raw-scans/](cbm/raw-scans) contains the raw PDF files as produced by the
  scanner from the paper original.

## VolksForth Atari ST 3.80 Manual

The [doc/atari-st/](atari-st) directory contains the German manual for the
Atari ST VolksForth version 3.80.

* [vf-atari-st-380-manual-de.pdf](atari-st/vf-atari-st-380-manual-de.pdf) is
  the scanned and OCR-ed PDF.
* [vf-atari-st-380-manual-de.sidecar.txt](atari-st/vf-atari-st-380-manual-de.sidecar.txt)
  is the sidecar text output generated by
  [OCRmyPDF](https://github.com/ocrmypdf/OCRmyPDF)'s option `--sidecar`.
* [raw-scans/](atari-st/raw-scans) contains the raw PDF files as produced by the
  scanner from the paper original.
* [LIESMICH.TXT](atari-st/LIESMICH.TXT) is an overview, in German,
  of VolksForth and of the files that come with the Atari ST version.
  Note: The .SCR files are Forth screen files, i.e. sources, and they have
  since been renamed to .FB (for Forth Block source).
* [README.TXT](atari-st/README.TXT) is the same, in English.
* [CHANGES.ORG](atari-st/CHANGES.ORG) is a change log, in German, between
  versions 3.7 and 3.80.

## VolksForth CP/M 3.80 Manual

The [doc/cpm/](cpm) directory contains the German manual for the CP/M
VolksForth version 3.80. Note that the CP/M VolksForth was shipped with the
C64/C16/Plus4 manual, and the CP/M manual only describes the CP/M VolksForth's
differences compared to the C64 etc. version.

* [VolksForth-3.80-CPM.pdf](cpm/VolksForth-3.80-CPM.pdf) is the scanned
  and OCR-ed PDF.
* [readme.org](cpm/readme.org) is a transcript of the scanned PDF. Note that
  the order of the chapters differs slightly between scan and transcript.

## VolksForth MSDOS 3.81 Manual

The [doc/msdos/](msdos) directory contains the German manual for the MSDOS
VolksForth version 3.81.

* [vf-msdos-381-manual-de.pdf](msdos/vf-msdos-381-manual-de.pdf) is the scanned
  and OCR-ed PDF.
* [vf-msdos-381-manual-de.sidecar.txt](msdos/vf-msdos-381-manual-de.sidecar.txt)
  is the sidecar text output generated by
  [OCRmyPDF](https://github.com/ocrmypdf/OCRmyPDF)'s option `--sidecar`.
* [raw-scans/](msdos/raw-scans) contains the raw PDF files as produced by the
  scanner from the paper original.
* [LIESMICH.TXT](msdos/LIESMICH.TXT) is a partial transcript of the scanned
  PDF.
* [README.TXT](msdos/README.TXT) is the beginning of a cross-platform overview
  of VolksForth, in English.

## Scanning and OCR notes

For the record, this is the procedure used to create the 3 newly scanned PDFs.

The scans were made from 3 printed manual copies in mint condition; the manuals
are in A5 format.

The scanner used is an HP Color LaserJet MFP M477fdn, which has a document feeder
with two-sided scanning ability and a fixed A4 scanning size.
Since a full VolksForth manual exceeds the capacity of the feeder,
each manual was split into 3 batches; the resulting A4 PDFs are now sitting
in the `raw-scans/` directories.

The raw scans `scan0000.pdf` to `scan0002.pdf` were concatenated and cropped
using the Linux GUI tool `pdfarranger` (version 1.4.2). Steps:

* Drag & drop all files from `raw-scans/` into the `pdfarranger` window.
* Press ctrl-A to select all pages.
* Edit -> Crop
  * Set lower margin to 29% (1 - 1/sqrt(2)).
  * Set left and right margin to 14.5% (29% / 2).
  * Click "OK".
* Edit -> Edit Properties
  * Set Creator to "Forth Gesellschaft e.V." (in other PDF viewers this is
    displayed as the Author property).
* Save as "newly-cropped.pdf"

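
The crop percentages in the steps above follow from the ISO 216 paper sizes, where each A size is the previous one scaled by 1/sqrt(2) along both edges. The following is only a minimal sketch of that arithmetic, not part of the original procedure. The function name is hypothetical, and it assumes (as the choice of margins suggests) that the A5 page sits at the top of the A4 scan area, centred horizontally.

```python
import math

# ISO 216 A4 sheet size in millimetres (portrait orientation).
A4_WIDTH_MM, A4_HEIGHT_MM = 210.0, 297.0


def a4_to_a5_crop_margins():
    """Return (bottom, left, right) crop fractions that trim an A4 page to A5.

    Hypothetical helper for illustration; assumes the A5 sheet is at the
    top of the scan area and centred horizontally.
    """
    # Each ISO A size is the previous one scaled by 1/sqrt(2) in both directions.
    scale = 1 / math.sqrt(2)
    bottom = 1 - scale      # trimmed from the lower edge only
    side = (1 - scale) / 2  # trimmed equally from the left and right edges
    return bottom, side, side


bottom, left, right = a4_to_a5_crop_margins()
cropped_width = A4_WIDTH_MM * (1 - left - right)
cropped_height = A4_HEIGHT_MM * (1 - bottom)
print(f"bottom {bottom:.1%}, left/right {left:.1%}")
print(f"cropped page: {cropped_width:.1f} mm x {cropped_height:.1f} mm")
```

The rounded values of 29% and 14.5% used in `pdfarranger` land within roughly a millimetre of the nominal A5 size of 148 mm x 210 mm.
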
The final searchable PDF was created from the intermediate `newly-cropped.pdf`
by adding an OCR text layer using OCRmyPDF:

```
ocrmypdf -l deu -d -c -i newly-cropped.pdf vf-<version>-manual-de.pdf --sidecar vf-<version>-manual-de.sidecar.txt
```

The sidecar file contains the OCR-ed text added into the text layer and is
expected to be useful as input for a machine-aided translation of the manual
into English.

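
For anyone repeating the OCR step for several manuals, the command line above can be assembled programmatically. This is only a sketch: the helper name `ocrmypdf_command` is hypothetical, the version strings are taken from the file names listed in the sections above, and nothing is executed here. The comments give the meaning of each short option as documented by OCRmyPDF.

```python
import shlex


def ocrmypdf_command(version, cropped_pdf="newly-cropped.pdf"):
    """Build the OCRmyPDF argument list for one manual (nothing is run).

    Hypothetical helper for illustration only.
    """
    base = f"vf-{version}-manual-de"
    return [
        "ocrmypdf",
        "-l", "deu",    # OCR language: German
        "-d",           # deskew crooked pages
        "-c",           # clean pages before OCR
        "-i",           # also use the cleaned pages in the final PDF
        cropped_pdf,    # input: the cropped intermediate PDF
        f"{base}.pdf",  # output: the searchable PDF
        "--sidecar", f"{base}.sidecar.txt",  # plain-text copy of the OCR result
    ]


# Version strings as they appear in the file names of the three new scans.
for version in ("cbm-380", "atari-st-380", "msdos-381"):
    print(shlex.join(ocrmypdf_command(version)))
```

The returned list could be passed to `subprocess.run` unchanged; printing it with `shlex.join` first makes it easy to compare against the hand-typed command.
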
A note about PDF versions: The raw scans are PDF-1.4. `pdfarranger` outputs
PDF-1.3, which seems to cause problems (error 14) when opening the files with
Adobe Acrobat. `ocrmypdf` produces PDF/A-2b, which does not seem to cause these
problems.
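
The version numbers mentioned above can be checked without a PDF viewer, because a PDF file begins with a header of the form `%PDF-1.4`. The following is a minimal sketch using only the Python standard library; the function name is hypothetical. It reads the header only, so it does not see a version override in the document catalog, and it cannot confirm PDF/A conformance, which is declared in the file's metadata rather than in the header.

```python
import os
import re
import tempfile


def pdf_header_version(path):
    """Return the version from a file's '%PDF-x.y' header, or None if absent.

    Hypothetical helper for illustration only.
    """
    with open(path, "rb") as f:
        head = f.read(1024)  # the header sits at or very near the file start
    match = re.search(rb"%PDF-(\d+\.\d+)", head)
    return match.group(1).decode("ascii") if match else None


# Demo on a throwaway file that contains nothing but a header line.
with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
    tmp.write(b"%PDF-1.3\n")
print(pdf_header_version(tmp.name))
os.remove(tmp.name)
```

Calling `pdf_header_version` on a raw scan, on `newly-cropped.pdf` and on the final PDF would show which header version each stage of the procedure writes.
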