ciderpress/app/Help/html/t272.htm

<HTML><HEAD>
	
	<TITLE>Tool - EOL Scanner</TITLE>
	<OBJECT TYPE="application/x-oleobject" CLASSID="clsid:1e2a7bd0-dab9-11d0-b93a-00c04fc99f9e">
		<PARAM NAME="Keyword" VALUE="eol">
		<PARAM NAME="Keyword" VALUE="scan">
	</OBJECT>
	<META NAME="AUTHOR" CONTENT="Copyright (C) 2014 by CiderPress authors">
	<META NAME="GENERATOR" CONTENT="HelpScribble 7.8.8">
	<STYLE> span { display: inline-block; }</STYLE>
</HEAD>
<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#800080" ALINK="#FF0000">
<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="4">EOL Scanner</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">&nbsp;</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">EOL is an acronym for "end of line".&nbsp; Each line of a text file ends with an end-of-line marker.&nbsp; The Apple II and Macintosh use a carriage return, UNIX systems use a linefeed, and Windows uses a carriage return followed by a linefeed.&nbsp; These are usually abbreviated "CR", "LF", and "CRLF".</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">&nbsp;</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">When text files are moved from one system to another, the end-of-line markers on text files need to be converted.&nbsp; Unfortunately, overzealous converters will sometimes convert a non-text file, resulting in corrupted data.&nbsp; The only way to tell if a file has been corrupted is to count up the occurrences of CR, LF, and CRLF, and see if they make sense.</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">&nbsp;</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">The EOL Scanner tool does exactly that.&nbsp; The number of times CR, LF, and CRLF appear are counted and displayed.&nbsp; This information, combined with some knowledge of the format of the file, will tell you if a file has been corrupted by an EOL conversion.&nbsp; The tool also counts up "high-ASCII" characters to test for conversions to and from DOS text files.</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">&nbsp;</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">Take for example a healthy 140K disk image.&nbsp; The scanner reports the following:</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><SPAN STYLE="width: 17pt"><FONT FACE="MS Sans Serif" SIZE="2">  </span>143360 characters</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><SPAN STYLE="width: 17pt"><FONT FACE="MS Sans Serif" SIZE="2">  </span>43582 high-ASCII characters</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><SPAN STYLE="width: 17pt"><FONT FACE="MS Sans Serif" SIZE="2">  </span>381 carriage returns</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><SPAN STYLE="width: 17pt"><FONT FACE="MS Sans Serif" SIZE="2">  </span>863 line feeds</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><SPAN STYLE="width: 17pt"><FONT FACE="MS Sans Serif" SIZE="2">  </span>3 CRLFs</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">&nbsp;</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">A typical disk image will have a smattering of CR (hex value 0x0d) and LF (hex value 0x0a).&nbsp; Occasionally they will appear near each other and form a CRLF.&nbsp; A disk image with lots of text files will have many more CRs than LFs, while a disk with programs on it could have any amount of either.&nbsp; (To be accurate, a DOS 3.3 disk with text files won't show a large number of CRs, because DOS 3.3 uses "high ASCII" 0x8d instead of 0x0d.)&nbsp; Note that an occurrence of "CRLF" only updates the CRLF counter, not the CR and LF counters.</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">&nbsp;</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">Now lets look at a disk image that doesn't seem to work.&nbsp; The scanner comes back with:</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><SPAN STYLE="width: 17pt"><FONT FACE="MS Sans Serif" SIZE="2">  </span>143360 characters</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><SPAN STYLE="width: 17pt"><FONT FACE="MS Sans Serif" SIZE="2">  </span>56085 high-ASCII characters</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><SPAN STYLE="width: 17pt"><FONT FACE="MS Sans Serif" SIZE="2">  </span>530 carriage returns</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><SPAN STYLE="width: 17pt"><FONT FACE="MS Sans Serif" SIZE="2">  </span>0 line feeds</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><SPAN STYLE="width: 17pt"><FONT FACE="MS Sans Serif" SIZE="2">  </span>0 CRLFs</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">&nbsp;</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">The disk image has absolutely no line feeds whatsoever.&nbsp; A blank formatted ProDOS disk will have 3 or 4 carriage returns and line feeds, a blank unbootable DOS disk 1 of each.&nbsp; A non-empty disk should never have zero CRs or LFs.&nbsp; This disk was corrupted by a converter that changed all of the linefeeds to carriage returns.</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">&nbsp;</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">Similar results hold for compressed data archives.&nbsp; Data that is well compressed will show a fairly even distribution of all possible characters, so a total absence of CR or LF is a big red flag.&nbsp; Of course, if the archive is only a few hundred bytes long, it's quite possible that no CRs or LFs will be found.</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">&nbsp;</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">Disk images in nibble format (.nib) have no CRs or LFs in them.&nbsp; This is normal.&nbsp; The entire file should be "high ASCII" characters.</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">&nbsp;</FONT></P>
<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">Besides its use as a diagnostic tool, the EOL Scanner can also tell you what format a text file is in, and also how many lines it has.
</FONT>
</P>
</BODY></HTML>
WinHelp to HtmlHelp conversion, part 1 The original version of CiderPress used a WinHelp help file, built with an application called HelpMatic Pro. This app used a proprietary format, and had no facility for exporting to "raw" HPJ + RTF files, so I decompiled the HLP and imported it into HelpScribble. Using HelpScribble, I cleaned up the help file formatting a little, fixed up the table of contents, and exported as "raw" HtmlHelp (HHP, HHK, HHC, and a whole bunch of HTML). I also split the pop-up help text, which isn't supported by HelpScribble, into a separate text file that Microsoft's HTML Help Workshop understands. I'm checking in the files that HTML Help Workshop needs to generate a CHM, so anyone can update the help text. I'm also checking in the CHM file, rather than adding the help workshop to the build, so that it's not necessary to download and configure the help workshop to build CiderPress. This change adds all of the updated help, but only updates the Help and question mark button actions for one specific dialog. A subsequent change will update the rest of the dialogs. This change is essentially upgrading us from a totally obsolete help system to a nearly-obsolete help system, but the systems are similar enough to make this a useful half-step on the way to something else. The code will centralize help activation in a pair of functions in the main app class, so any future improvements should be more limited in scope. This also adds a build step to copy the CHM to the execution directory. 2014-12-09 06:34:34 +00:00			`<HTML><HEAD>`

			`<TITLE>Tool - EOL Scanner</TITLE>`
			`<OBJECT TYPE="application/x-oleobject" CLASSID="clsid:1e2a7bd0-dab9-11d0-b93a-00c04fc99f9e">`
			`<PARAM NAME="Keyword" VALUE="eol">`
			`<PARAM NAME="Keyword" VALUE="scan">`
			`</OBJECT>`
			`<META NAME="AUTHOR" CONTENT="Copyright (C) 2014 by CiderPress authors">`
			`<META NAME="GENERATOR" CONTENT="HelpScribble 7.8.8">`
			`<STYLE> span { display: inline-block; }</STYLE>`
			`</HEAD>`
			`<BODY BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#800080" ALINK="#FF0000">`
			`<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="4">EOL Scanner</FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2"> </FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">EOL is an acronym for "end of line".  Each line of a text file ends with an end-of-line marker.  The Apple II and Macintosh use a carriage return, UNIX systems use a linefeed, and Windows uses a carriage return followed by a linefeed.  These are usually abbreviated "CR", "LF", and "CRLF".</FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2"> </FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">When text files are moved from one system to another, the end-of-line markers on text files need to be converted.  Unfortunately, overzealous converters will sometimes convert a non-text file, resulting in corrupted data.  The only way to tell if a file has been corrupted is to count up the occurrences of CR, LF, and CRLF, and see if they make sense.</FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2"> </FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">The EOL Scanner tool does exactly that.  The number of times CR, LF, and CRLF appear are counted and displayed.  This information, combined with some knowledge of the format of the file, will tell you if a file has been corrupted by an EOL conversion.  The tool also counts up "high-ASCII" characters to test for conversions to and from DOS text files.</FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2"> </FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">Take for example a healthy 140K disk image.  The scanner reports the following:</FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><SPAN STYLE="width: 17pt"><FONT FACE="MS Sans Serif" SIZE="2"> </span>143360 characters</FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><SPAN STYLE="width: 17pt"><FONT FACE="MS Sans Serif" SIZE="2"> </span>43582 high-ASCII characters</FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><SPAN STYLE="width: 17pt"><FONT FACE="MS Sans Serif" SIZE="2"> </span>381 carriage returns</FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><SPAN STYLE="width: 17pt"><FONT FACE="MS Sans Serif" SIZE="2"> </span>863 line feeds</FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><SPAN STYLE="width: 17pt"><FONT FACE="MS Sans Serif" SIZE="2"> </span>3 CRLFs</FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2"> </FONT></P>`
			<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">A typical disk image will have a smattering of CR (hex value 0x0d) and LF (hex value 0x0a).  Occasionally they will appear near each other and form a CRLF.  A disk image with lots of text files will have many more CRs than LFs, while a disk with programs on it could have any amount of either.  (To be accurate, a DOS 3.3 disk with text files won't show a large number of CRs, because DOS 3.3 uses "high ASCII" 0x8d instead of 0x0d.)  Note that an occurrence of "CRLF" only updates the CRLF counter, not the CR and LF counters.</FONT></P>
			`<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2"> </FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">Now lets look at a disk image that doesn't seem to work.  The scanner comes back with:</FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><SPAN STYLE="width: 17pt"><FONT FACE="MS Sans Serif" SIZE="2"> </span>143360 characters</FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><SPAN STYLE="width: 17pt"><FONT FACE="MS Sans Serif" SIZE="2"> </span>56085 high-ASCII characters</FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><SPAN STYLE="width: 17pt"><FONT FACE="MS Sans Serif" SIZE="2"> </span>530 carriage returns</FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><SPAN STYLE="width: 17pt"><FONT FACE="MS Sans Serif" SIZE="2"> </span>0 line feeds</FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><SPAN STYLE="width: 17pt"><FONT FACE="MS Sans Serif" SIZE="2"> </span>0 CRLFs</FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2"> </FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">The disk image has absolutely no line feeds whatsoever.  A blank formatted ProDOS disk will have 3 or 4 carriage returns and line feeds, a blank unbootable DOS disk 1 of each.  A non-empty disk should never have zero CRs or LFs.  This disk was corrupted by a converter that changed all of the linefeeds to carriage returns.</FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2"> </FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">Similar results hold for compressed data archives.  Data that is well compressed will show a fairly even distribution of all possible characters, so a total absence of CR or LF is a big red flag.  Of course, if the archive is only a few hundred bytes long, it's quite possible that no CRs or LFs will be found.</FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2"> </FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">Disk images in nibble format (.nib) have no CRs or LFs in them.  This is normal.  The entire file should be "high ASCII" characters.</FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2"> </FONT></P>`
			`<P STYLE="margin-top:0;margin-bottom:0;"><FONT FACE="MS Sans Serif" SIZE="2">Besides its use as a diagnostic tool, the EOL Scanner can also tell you what format a text file is in, and also how many lines it has.`
			`</FONT>`
			`</P>`
			`</BODY></HTML>`