VolksForth/doc/About.md
Philip Zembrod c0d0e019f5 Add doc/About.md describing the content of the doc/ directory, as well as
the procedure used when creating the new scans.
2024-08-04 13:40:59 +02:00

About the Scanned Manuals

This directory's main content is scanned versions of the original German manuals of VolksForth from the 1980s. There were four main flavours of VolksForth and accordingly four manuals: C64/C16/Plus4, Atari ST, CP/M and MSDOS.

The manuals for C64/C16/Plus4, Atari ST and MSDOS have recently been rescanned and OCR-ed. Of the CP/M manual we have an older scan and an almost complete Org Mode transcript. A partial Org Mode transcript also exists of the MSDOS manual.

Based on the various text versions of the manuals (transcripts, sidecar files from ocrmypdf), an English translation is being started in the 6502/C64/doc directory for the C64 3.9.6 release. Eventually this is intended to result in a unified manual for all versions.

Note: The mix of Org Mode and Markdown in documents here stems from the different preferences or past habits of different contributors.

VolksForth CBM 3.80 Manual

The doc/cbm/ directory contains the German manual for the C64/C16/Plus4 VolksForth version 3.80.

VolksForth Atari ST 3.80 Manual

The doc/atari-st/ directory contains the German manual for the Atari ST VolksForth version 3.80.

  • vf-atari-st-380-manual-de.pdf is the scanned and OCR-ed PDF.
  • vf-atari-st-380-manual-de.sidecar.txt is the sidecar text output generated by OCRmyPDF's option --sidecar.
  • raw-scans/ contains the raw PDF files as produced by the scanner from the paper original.
  • LIESMICH.TXT is an overview, in German, of VolksForth and of the files that come with the Atari ST version. Note: The .SCR files are Forth screen files, i.e. sources, and they have since been renamed to .FB (for Forth Block source).
  • README.TXT is the same, in English.
  • CHANGES.ORG is a change log, in German, between versions 3.7 and 3.80.

VolksForth CP/M 3.80 Manual

The doc/cpm/ directory contains the German manual for the CP/M VolksForth version 3.80. Note that the CP/M VolksForth was shipped with the C64/C16/Plus4 manual, and the CP/M manual only describes how the CP/M VolksForth differs from the C64/C16/Plus4 version.

  • VolksForth-3.80-CPM.pdf is the scanned and OCR-ed PDF.
  • readme.org is a transcript of the scanned PDF. Note that the order of the chapters differs slightly between scan and transcript.

VolksForth MSDOS 3.81 Manual

The doc/msdos/ directory contains the German manual for the MSDOS VolksForth version 3.81.

Scanning and OCR notes

For the record, this is the procedure used to create the 3 newly scanned PDFs:

The scans were made from 3 printed manual copies in mint condition; the manuals are in A5 format.

The scanner used is an HP Color LaserJet MFP M477fdn, which has a document feeder with two-sided scanning ability and a fixed A4 scanning size. Since a full VolksForth manual exceeds the capacity of the feeder, each manual was split into 3 batches; the resulting A4 PDFs are kept in the raw-scans/ directories.

The raw scans scan0000.pdf to scan0002.pdf were concatenated and cropped using the Linux GUI tool pdfarranger (version 1.4.2). Steps:

  • Drag & drop all files from raw-scans/ into the pdfarranger window.
  • Press ctrl-A to select all pages.
  • Edit -> Crop
    • Set lower margin to 29% (1 - 1 / sqrt(2)).
    • Set left and right margin to 14.5% (29% / 2).
    • Click "OK".
  • Edit -> Edit Properties
    • Set Creator to "Forth Gesellschaft e.V." (in other PDF viewers this is displayed as the Author property).
  • Save as "newly-cropped.pdf"
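The crop percentages in the steps above follow from the ISO A-series geometry: an A5 page is an A4 page scaled by 1 / sqrt(2) in each dimension. The following is a minimal sketch of that arithmetic, assuming (as the margins suggest) that the A5 page sits at the top of the A4 scan area and is horizontally centred. The function name is made up for illustration.

```python
import math

def a5_on_a4_crop_margins():
    """Return (lower, side) crop fractions for an A5 page on an A4 scan.

    Assumption: the page is aligned with the top edge of the scan area
    and centred horizontally, so only the bottom and both sides are cropped.
    """
    scale = 1 / math.sqrt(2)      # A5 edge length relative to A4
    lower = 1 - scale             # unused strip below the page
    side = (1 - scale) / 2        # unused strip on each side
    return lower, side

lower, side = a5_on_a4_crop_margins()
print(f"lower margin:      {lower:.1%}")
print(f"left/right margin: {side:.1%} each")
```

The unrounded values are about 29.3% and 14.6%; the 14.5% used in the steps is simply half of the rounded 29%.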

The final searchable PDF was created from the intermediate newly-cropped.pdf by adding an OCR text layer using OCRmyPDF:

ocrmypdf -l deu -d -c -i newly-cropped.pdf vf-<version>-manual-de.pdf --sidecar vf-<version>-manual-de.sidecar.txt

The sidecar file contains the OCR-ed text added into the text layer and is expected to be useful as input for a machine-aided translation of the manual into English.
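For repeating the OCR step across several manuals, the command can be assembled programmatically. This is only a sketch: the helper names and the dry-run default are illustrative and not part of the original procedure. The `<version>` value "atari-st-380" is taken from the Atari ST file names listed above.

```python
import shlex
import subprocess

def build_ocrmypdf_command(version, cropped="newly-cropped.pdf"):
    """Return the argument list for the OCR step, mirroring the command above."""
    output = f"vf-{version}-manual-de.pdf"
    sidecar = f"vf-{version}-manual-de.sidecar.txt"
    return [
        "ocrmypdf",
        "-l", "deu",            # OCR language: German
        "-d",                   # --deskew
        "-c",                   # --clean
        "-i",                   # --clean-final
        cropped,                # input: cropped, concatenated scan
        output,                 # output: searchable PDF
        "--sidecar", sidecar,   # plain-text copy of the OCR result
    ]

def run_ocr(version, dry_run=True):
    """Print the command (dry run) or execute it; returns the argument list."""
    cmd = build_ocrmypdf_command(version)
    if dry_run:
        print(shlex.join(cmd))
    else:
        subprocess.run(cmd, check=True)  # requires ocrmypdf to be installed
    return cmd

run_ocr("atari-st-380")
```

Passing the arguments as a list avoids shell quoting issues. With `dry_run=True` the command is only printed, so it can be checked before anything is run.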

A note about PDF versions: The raw scans are PDF-1.4. pdfarranger outputs PDF-1.3, which seems to cause problems (error 14) when opening the files with Adobe Acrobat. ocrmypdf produces PDF/A-2b, which does not seem to cause these problems.
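To check which PDF version a given file declares, it is enough to look at its header line, which has the form %PDF-1.x. The sketch below uses hypothetical helper names and reads only that header. It does not detect PDF/A conformance, which is declared in the file's XMP metadata rather than in the header, and it ignores a version override in the document catalog.

```python
import re

def pdf_header_version(data: bytes):
    """Return the version from a PDF header such as b'%PDF-1.4', or None."""
    match = re.match(rb"%PDF-(\d+\.\d+)", data[:1024].lstrip())
    return match.group(1).decode("ascii") if match else None

def pdf_file_version(path):
    """Read the start of the file at `path` and return its declared PDF version."""
    with open(path, "rb") as f:
        return pdf_header_version(f.read(1024))

# Example with in-memory header bytes instead of a real file:
print(pdf_header_version(b"%PDF-1.3\n"))
```

Calling `pdf_file_version` on the files in raw-scans/, on newly-cropped.pdf and on the final PDF makes it possible to compare the declared versions at each stage.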