About the Scanned Manuals
This directory's main content is scanned versions of the original German manuals of VolksForth from the 80s. There were 4 main flavours of VolksForth and accordingly 4 manuals: C64/C16/Plus4, Atari ST, CP/M and MSDOS.
The manuals for C64/C16/Plus4, Atari ST and MSDOS have recently been rescanned and OCR-ed. Of the CP/M manual we have an older scan and an almost complete Org Mode transcript. A partial Org Mode transcript of the MSDOS manual also exists.
Based on the different text versions of the different manuals (transcripts, sidecar files from ocrmypdf), a translation into an English manual is being started in the 6502/C64/doc directory for the C64 3.9.6 release. Eventually this is intended to result in a unified manual for all versions.
Note: The mix of Org Mode and Markdown in documents here stems from different preferences or past habits of different contributors.
VolksForth CBM 3.80 Manual
The doc/cbm/ directory contains the German manual for the C64/C16/Plus4 VolksForth version 3.80.
- vf-cbm-380-manual-de.pdf is the scanned and OCR-ed PDF.
- vf-cbm-380-manual-de.sidecar.txt is the sidecar text output generated by OCRmyPDF's option --sidecar.
- raw-scans/ contains the raw PDF files as produced by the scanner from the paper original.
VolksForth Atari ST 3.80 Manual
The doc/atari-st/ directory contains the German manual for the Atari ST VolksForth version 3.80.
- vf-atari-st-380-manual-de.pdf is the scanned and OCR-ed PDF.
- vf-atari-st-380-manual-de.sidecar.txt is the sidecar text output generated by OCRmyPDF's option --sidecar.
- raw-scans/ contains the raw PDF files as produced by the scanner from the paper original.
- LIESMICH.TXT is an overview, in German, of VolksForth and of the files that come with the Atari ST version. Note: The .SCR files are Forth screen files, i.e. sources, and they have since been renamed to .FB (for Forth Block source).
- README.TXT is the same, in English.
- CHANGES.ORG is a change log, in German, between versions 3.7 and 3.80.
VolksForth CP/M 3.80 Manual
The doc/cpm/ directory contains the German manual for the CP/M VolksForth version 3.80. Note that the CP/M VolksForth was shipped with the C64/C16/Plus4 manual, and the CP/M manual only describes the CP/M VolksForth's differences compared to the C64 etc. version.
- VolksForth-3.80-CPM.pdf is the scanned and OCR-ed PDF.
- readme.org is a transcript of the scanned PDF. Note that the order of the chapters differs slightly between scan and transcript.
VolksForth MSDOS 3.81 Manual
The doc/msdos/ directory contains the German manual for the MSDOS VolksForth version 3.81.
- vf-msdos-381-manual-de.pdf is the scanned and OCR-ed PDF.
- vf-msdos-381-manual-de.sidecar.txt is the sidecar text output generated by OCRmyPDF's option --sidecar.
- raw-scans/ contains the raw PDF files as produced by the scanner from the paper original.
- LIESMICH.TXT is a partial transcript of the scanned PDF.
- README.TXT is the beginning of a cross-platform overview of VolksForth, in English.
Scanning and OCR notes
For the record, this is the procedure used to create the 3 newly scanned PDFs:
The scans were made from 3 printed manual copies in mint condition; the manuals are in A5 format.
The scanner used is an HP Color LaserJet MFP M477fdn, which has a document feeder with two-sided scanning ability and a fixed A4 scanning size. Since a full VolksForth manual exceeds the capacity of the feeder, each manual was split into 3 batches; the resulting A4 PDFs are now sitting in the raw-scans/ directories.
The raw scans scan0000.pdf to scan0002.pdf were concatenated and cropped using the Linux GUI tool pdfarranger (version 1.4.2). Steps:
- Drag & drop all files from raw-scans/ into the pdfarranger window.
- Press Ctrl-A to select all pages.
- Edit -> Crop
- Set lower margin to 29% (1 - (1 / sqrt(2))).
- Set left and right margin to 14.5% (29% / 2).
- Click "OK".
- Edit -> Edit Properties
- Set Creator to "Forth Gesellschaft e.V." (in other PDF viewers this is displayed as the Author property).
- Save as "newly-cropped.pdf"
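The crop percentages in the steps above follow from the paper geometry: an A5 page is an A4 page scaled by 1/sqrt(2) in each dimension. The short Python sketch below reproduces that arithmetic. It assumes the A5 page sits at the top of the A4 scan area and is horizontally centred, which is what the margins in the steps imply; the function name is made up for illustration.

```python
import math


def a5_on_a4_crop_margins():
    """Return crop margins as fractions of the A4 page size.

    Assumption: the A5 page is aligned with the top edge of the A4
    scan area and centred horizontally.
    """
    scale = 1 / math.sqrt(2)   # linear scale factor from A4 to A5
    excess = 1 - scale         # unused share of the A4 page per dimension
    return {
        "bottom": excess,      # all vertical excess lies below the page
        "left": excess / 2,    # horizontal excess is split evenly
        "right": excess / 2,
    }


for side, fraction in a5_on_a4_crop_margins().items():
    print(f"{side}: {fraction:.1%}")
```

The exact values are about 29.3% and 14.6%; the steps above use the rounded figures 29% and 14.5%.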
The final searchable PDF was created from the intermediate newly-cropped.pdf
by adding an OCR text layer using OCRmyPDF:
ocrmypdf -l deu -d -c -i newly-cropped.pdf vf-<version>-manual-de.pdf --sidecar vf-<version>-manual-de.sidecar.txt
The sidecar file contains the OCR-ed text added into the text layer and is expected to be useful as input for a machine-aided translation of the manual into English.
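As an illustration of how the `<version>` placeholder expands, the following Python sketch builds the command line for each of the three rescanned manuals without running it. The helper name `ocrmypdf_command` is hypothetical; the flags are copied unchanged from the command above, and the version strings are taken from the file names listed earlier in this document.

```python
import shlex


def ocrmypdf_command(version, cropped="newly-cropped.pdf"):
    """Build the OCRmyPDF argument list for one manual (not executed here)."""
    stem = f"vf-{version}-manual-de"
    return [
        "ocrmypdf",
        "-l", "deu",        # OCR language: German
        "-d", "-c", "-i",   # same flags as in the command above
        cropped,            # input: the cropped intermediate PDF
        f"{stem}.pdf",      # output: the searchable PDF
        "--sidecar", f"{stem}.sidecar.txt",
    ]


for version in ("cbm-380", "atari-st-380", "msdos-381"):
    print(shlex.join(ocrmypdf_command(version)))
```

To actually run a command, the list could be passed to `subprocess.run(..., check=True)` from within the directory that holds the respective newly-cropped.pdf.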
A note about PDF versions: The raw scans are PDF-1.4; pdfarranger outputs PDF-1.3, which seems to cause problems (error 14) when opening files with Adobe Acrobat. ocrmypdf produces PDF/A-2b, which does not seem to cause these problems.
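To check which version a given file declares, one option is to read the `%PDF-x.y` marker that PDF files carry at their start. The sketch below is a minimal Python example with hypothetical helper names. It only reports the header version; it does not verify PDF/A conformance, which is declared in the file's metadata rather than in the header.

```python
import re
from pathlib import Path


def parse_pdf_version(data: bytes):
    """Return the version from a '%PDF-x.y' header, or None if absent."""
    # The marker normally sits at the very start; search the first KiB
    # to tolerate a few leading junk bytes.
    match = re.search(rb"%PDF-(\d\.\d)", data[:1024])
    return match.group(1).decode("ascii") if match else None


def pdf_file_version(path):
    """Read only the beginning of the file and parse its header version."""
    with Path(path).open("rb") as f:
        return parse_pdf_version(f.read(1024))


# Example with in-memory data resembling the start of a pdfarranger output:
print(parse_pdf_version(b"%PDF-1.3\n%\xe2\xe3\xcf\xd3\n"))  # prints 1.3
```

For instance, `pdf_file_version("raw-scans/scan0000.pdf")` would be expected to return "1.4" for the raw scans described above.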