Updating READMEs.

Rob Greene 2018-06-19 20:09:53 -05:00
parent 707b567b96
commit be6670d05a
4 changed files with 250 additions and 107 deletions

api/README-SHAPES.md Normal file

@ -0,0 +1,92 @@
# Shape Tooling
The Shape API allows:
* Shape tables to be read in the standard binary format;
* Shape tables to be generated from "source" in three formats;
* Shape tables to be written to the standard binary format;
* Shapes and shape tables to be written as a text or image representation.
## API Notes
The shape table is represented by the `ShapeTable` class which has static `read` methods.
To generate a shape table from "source" use the `ShapeGenerator` class.
The `ShapeTable` object holds a list of `Shape`s. A `Shape` can be converted to a `VectorShape`
(up, down, left, right, plot/no plot) or to a `BitmapShape` with the `Shape#toVector()` and
`Shape#toBitmap()` methods.
## Shape source
These samples all define the same shape, a box, as given in the Applesoft BASIC Programmer's Reference Manual.
## Shape source - bitmap format
To introduce a bitmap shape, use the `.bitmap` directive.
The bitmap defines an XY grid of plot/no-plot cells. An origin may be specified; if omitted, it defaults to (0,0), the upper-left corner.
Notes:
* `x` = plot
* `.` = no plot; used to clarify image regions
* `+` = origin, no plot (assumed to be upper-left if unspecified)
* `*` = origin, plot
* whitespace is ignored
Sample:
```
.bitmap
.xxx.
x...x
x.+.x
x...x
.xxx.
```
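The bitmap rules above can be sketched in a few lines of Java. This is only an illustration of the notation, not the library's `ShapeGenerator`; the class `BitmapSourceSketch`, its `Parsed` holder, and the `parse` method are hypothetical names, and the sketch assumes a lowercase `.bitmap` directive on its own line.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of the bitmap notation; not the library's API. */
final class BitmapSourceSketch {
    /** Parsed result: the plot grid plus the origin position. */
    static final class Parsed {
        final boolean[][] plot;   // plot[y][x] is true where a pixel is plotted
        final int originX;
        final int originY;

        Parsed(boolean[][] plot, int originX, int originY) {
            this.plot = plot;
            this.originX = originX;
            this.originY = originY;
        }
    }

    static Parsed parse(String source) {
        List<boolean[]> rows = new ArrayList<>();
        int originX = 0;   // origin defaults to the upper-left corner (0,0)
        int originY = 0;
        for (String line : source.split("\n")) {
            String row = line.replaceAll("\\s", "");   // whitespace is ignored
            if (row.isEmpty() || row.equals(".bitmap")) {
                continue;   // skip blank lines and the directive itself
            }
            boolean[] cells = new boolean[row.length()];
            for (int x = 0; x < row.length(); x++) {
                char c = row.charAt(x);
                switch (c) {
                    case 'x': cells[x] = true; break;             // plot
                    case '.': break;                              // no plot
                    case '+': originX = x; originY = rows.size(); break;   // origin, no plot
                    case '*': cells[x] = true;                    // origin, plot
                              originX = x; originY = rows.size(); break;
                    default:
                        throw new IllegalArgumentException("Unexpected character: " + c);
                }
            }
            rows.add(cells);
        }
        return new Parsed(rows.toArray(new boolean[0][]), originX, originY);
    }
}
```

Parsing the box sample with this sketch gives a 5x5 grid with the origin at (2,2), the unplotted center cell.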
## Shape source - long vector format
To introduce a long vector shape, use the `.long` directive.
Notes:
* `move`[`up`|`down`|`left`|`right`] = move vector
* `plot`[`up`|`down`|`left`|`right`] = plot vector
* whitespace is ignored
* case insensitive
* accepts a numerical argument for repetition
```
.long
movedown 2
plotleft 2
moveup
plotup 3
moveright
plotright 3
movedown
plotdown 3
moveleft
plotleft
```
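To show how the long and short notations relate, here is a minimal sketch that rewrites long vector source as short vector letters. It is not the library's parser: `LongVectorSketch` and `toShort` are hypothetical names, and the sketch assumes the repeat count is separated from the vector name by whitespace, as in the sample above.

```java
import java.util.Locale;

/** Hypothetical illustration: translate long vector source to short vector letters. */
final class LongVectorSketch {
    static String toShort(String source) {
        StringBuilder out = new StringBuilder();
        String[] words = source.trim().split("\\s+");
        int i = 0;
        while (i < words.length) {
            String word = words[i++].toLowerCase(Locale.ROOT);   // case insensitive
            if (word.isEmpty() || word.equals(".long")) {
                continue;   // skip the directive
            }
            boolean plot;
            if (word.startsWith("plot")) {
                plot = true;
            } else if (word.startsWith("move")) {
                plot = false;
            } else {
                throw new IllegalArgumentException("Unknown vector: " + word);
            }
            char letter;
            switch (word.substring(4)) {
                case "up":    letter = 'u'; break;
                case "down":  letter = 'd'; break;
                case "left":  letter = 'l'; break;
                case "right": letter = 'r'; break;
                default:
                    throw new IllegalArgumentException("Unknown direction: " + word);
            }
            int count = 1;   // the repetition argument is optional
            if (i < words.length && words[i].matches("\\d+")) {
                count = Integer.parseInt(words[i++]);
            }
            for (int n = 0; n < count; n++) {
                // Short format: lowercase = move vector, uppercase = plot vector
                out.append(plot ? Character.toUpperCase(letter) : letter);
            }
        }
        return out.toString();
    }
}
```

Applied to the box sample, this yields `ddLLuUUUrRRRdDDDlL`, the same letters as the short vector sample once whitespace is dropped.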
## Shape source - short vector format
To introduce a short vector shape, use the `.short` directive.
Notes:
* `u`, `d`, `l`, `r` = move vector
* `U`, `D`, `L`, `R` = plot vector
* whitespace is ignored
* case sensitive
```
.short
dd
LL
uUUU
rRRR
dDDD
lL
```
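The following sketch traces a short vector string onto a grid, which may help when checking a shape by hand. It assumes that a plot vector plots the current point and then moves (the usual Apple II shape table behavior); `ShortVectorSketch` and `render` are hypothetical names, and the input is the vector body without the `.short` directive line.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration: trace short vectors and render them as text. */
final class ShortVectorSketch {
    static String render(String vectors) {
        Set<String> plotted = new HashSet<>();
        int x = 0, y = 0;                             // start at the origin
        int minX = 0, maxX = 0, minY = 0, maxY = 0;   // bounding box always includes the origin
        for (char c : vectors.toCharArray()) {
            if (Character.isWhitespace(c)) {
                continue;                             // whitespace is ignored
            }
            if (Character.isUpperCase(c)) {
                // Assumption: a plot vector plots the current point, then moves
                plotted.add(x + "," + y);
                minX = Math.min(minX, x);
                maxX = Math.max(maxX, x);
                minY = Math.min(minY, y);
                maxY = Math.max(maxY, y);
            }
            switch (Character.toLowerCase(c)) {
                case 'u': y--; break;
                case 'd': y++; break;
                case 'l': x--; break;
                case 'r': x++; break;
                default:
                    throw new IllegalArgumentException("Unknown vector: " + c);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (int row = minY; row <= maxY; row++) {
            for (int col = minX; col <= maxX; col++) {
                boolean on = plotted.contains(col + "," + row);
                boolean origin = col == 0 && row == 0;
                // Same symbols as the bitmap format: x, ., + and *
                sb.append(origin ? (on ? '*' : '+') : (on ? 'x' : '.'));
            }
            sb.append('\n');
        }
        return sb.toString();
    }
}
```

Under that plot-then-move assumption, rendering the box sample reproduces the grid from the bitmap sample, with `+` marking the unplotted origin in the center.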

api/README-TOKENIZER.md Normal file

@ -0,0 +1,104 @@
# Tokenizer Overview
Generally, the usage pattern is:
1. Set up the `Configuration`.
2. Read the tokens.
3. Parse the tokens into a `Program`.
4. Apply transformations, if applicable.
## Code snippets
```java
Configuration config = Configuration.builder()
.sourceFile(this.sourceFile)
.build();
```
The `Configuration` class also allows the BASIC start address (defaults to `0x801`) and the maximum line length (in bytes; defaults to `255`, but feel free to experiment) to be set. Some of the classes report output via the debug stream, which defaults to a simple null stream (no output); replace it with `System.out` or another `PrintStream`.
```java
Queue<Token> tokens = TokenReader.tokenize(config.sourceFile);
```
The list of tokens is a loose interpretation of the source. The tokens are closer to a compiler's sense of tokens: numbers, end-of-line markers (they're significant), Applesoft tokens, strings, comments, identifiers, etc.
```java
Parser parser = new Parser(tokens);
Program program = parser.parse();
```
The `Program` is now the parsed version of the BASIC program. Various `Visitor`s may be used to report, gather information, or manipulate the tree in various ways.
## Directives
The framework allows embedding of directives.
### `$embed`
`$embed` allows a binary to be embedded within the resulting application *and moves it to a destination in memory*. Please note that once the application is loaded on the Apple II, the program cannot be altered, as the computer will crash. Usage example:
```
5 $embed "read.time.bin", "0x0260"
```
The `$embed` directive _must_ be last on the line (if there are comments, be sure to use the `REMOVE_REM_STATEMENTS` optimization). It takes two parameters, file name and target address; both are strings.
From the `circles-timing.bas` sample, this is the beginning of the program:
```
0801:9A 09 00 00 8C 32 30 36 32 3A AB 31 00 A9 2B 85
     \___/ \___/ \____________/    \___/    \_______...
     Ptr,  Line 0, CALL 2062,   :, GOTO 1,  Assembly code...
```
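The hex dump can be checked by hand, or with a small sketch like the one below. It decodes the line header (little-endian next-line pointer and line number), reads the ASCII digits after the `CALL` token (`$8C`), and computes where the assembly code begins. The class and method names are hypothetical; only the bytes come from the dump above.

```java
/** Hypothetical illustration: decode the first tokenized line from the dump. */
final class BasicLineHeaderSketch {
    static final int START = 0x0801;     // default BASIC start address
    static final int CALL_TOKEN = 0x8C;  // Applesoft token for CALL
    static final int[] BYTES = {
        0x9A, 0x09, 0x00, 0x00, 0x8C, 0x32, 0x30, 0x36,
        0x32, 0x3A, 0xAB, 0x31, 0x00, 0xA9, 0x2B, 0x85
    };

    /** Reads a little-endian 16-bit word at the given offset. */
    static int word(int offset) {
        return BYTES[offset] | (BYTES[offset + 1] << 8);
    }

    /** Parses the ASCII decimal digits that follow the CALL token. */
    static int callTarget() {
        int i = 4;   // the statement starts after the pointer and line number
        if (BYTES[i] != CALL_TOKEN) {
            throw new IllegalStateException("Expected CALL token");
        }
        int value = 0;
        for (i++; BYTES[i] >= '0' && BYTES[i] <= '9'; i++) {
            value = value * 10 + (BYTES[i] - '0');
        }
        return value;
    }

    /** Address of the first byte after the line's 0x00 terminator. */
    static int codeStart() {
        int i = 4;
        while (BYTES[i] != 0x00) {
            i++;
        }
        return START + i + 1;
    }

    public static void main(String[] args) {
        System.out.printf("next line at $%04X, line %d, CALL %d, code at %d%n",
                word(0), word(2), callTarget(), codeStart());
    }
}
```

The digits `32 30 36 32` spell `2062`, which is `$080E`: the address of the first byte after the line terminator, so the `CALL` lands directly on the embedded move code.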
The move code is based on what Beagle Bros put into their [Peeks, Pokes, and Pointers](https://beagle.applearchives.com/Posters/Poster%202.pdf) poster. (See _Memory Move_ under the *Useful Calls*; the `CALL -468` entry.)
```
LDA #<embeddedStart
STA $3C
LDA #>embeddedStart
STA $3D
LDA #<embeddedEnd
STA $3E
LDA #>embeddedEnd
STA $3F
LDA #<targetAddress
STA $42
LDA #>targetAddress
STA $43
LDY #0
JMP $FE2C
```
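As a rough sketch of how such a routine could be emitted as bytes, the Java below encodes the listing with the standard 6502 opcodes (`LDA #imm` = `$A9`, `STA zp` = `$85`, `LDY #imm` = `$A0`, `JMP abs` = `$4C`). This is not the tool's actual generator; `MoveCodeSketch` and `assemble` are hypothetical names.

```java
import java.io.ByteArrayOutputStream;

/** Hypothetical illustration: encode the memory-move routine as 6502 machine code. */
final class MoveCodeSketch {
    static byte[] assemble(int embeddedStart, int embeddedEnd, int targetAddress) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        loadAndStore(out, embeddedStart & 0xFF, 0x3C);   // source start, low byte
        loadAndStore(out, embeddedStart >> 8, 0x3D);     // source start, high byte
        loadAndStore(out, embeddedEnd & 0xFF, 0x3E);     // source end, low byte
        loadAndStore(out, embeddedEnd >> 8, 0x3F);       // source end, high byte
        loadAndStore(out, targetAddress & 0xFF, 0x42);   // destination, low byte
        loadAndStore(out, targetAddress >> 8, 0x43);     // destination, high byte
        out.write(0xA0);                                 // LDY #0
        out.write(0x00);
        out.write(0x4C);                                 // JMP $FE2C (address is little-endian)
        out.write(0x2C);
        out.write(0xFE);
        return out.toByteArray();
    }

    /** Emits LDA #value / STA zeroPage. */
    private static void loadAndStore(ByteArrayOutputStream out, int value, int zeroPage) {
        out.write(0xA9);           // LDA #immediate
        out.write(value & 0xFF);
        out.write(0x85);           // STA zero page
        out.write(zeroPage);
    }
}
```

The routine is 29 bytes long. If it starts at `$080E`, the embedded binary would begin at `$082B`, which agrees with the `A9 2B 85` bytes in the dump above (`LDA #$2B` followed by `STA`).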
### `$hex`
If embedding hexadecimal addresses into an application makes sense, the `$hex` directive allows that to be done in a rudimentary manner.
Sample:
```
10 call $hex "fc58"
```
Yields:
```
10 call -936
```
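The arithmetic behind this sample is a 16-bit two's complement conversion: `$FC58` is 64600, and 64600 - 65536 = -936. A minimal sketch, assuming the directive always emits the signed 16-bit form for addresses above 32767 (the method name `toCallAddress` is hypothetical):

```java
/** Hypothetical illustration of the $hex arithmetic; not the tool's implementation. */
final class HexDirectiveSketch {
    static int toCallAddress(String hex) {
        int value = Integer.parseInt(hex, 16);
        if (value < 0 || value > 0xFFFF) {
            throw new IllegalArgumentException("Not a 16-bit address: " + hex);
        }
        // Addresses above 32767 are written in their negative (signed 16-bit) form
        return value > 32767 ? value - 65536 : value;
    }

    public static void main(String[] args) {
        System.out.println(toCallAddress("fc58"));
    }
}
```

The same conversion maps `$FE2C` to `-468`, matching the `CALL -468` entry mentioned for the memory move routine.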
## Optimizations
Optimizations are mechanisms to rewrite the `Program`, typically making the program smaller. `Optimization` itself is an enum which has a `create` method to set up the `Visitor`.
Current optimizations are:
* _Remove empty statements_ will remove all extra colons, for example when the application in question used `:` to indicate nesting, or when they are just accidents!
* _Remove REM statements_ will remove all comments.
* _Extract constant values_ will find all constant numerical references, insert a line `0` with assignments, and finally replace all the numbers with the appropriate variable name. The hypothesis is that the BASIC interpreter then only has to parse each number once.
* _Merge lines_ will identify all lines that are not a target of `GOTO`/`GOSUB`-type action and rewrite the line by merging it with others. The concept involved is that the BASIC program is just a linked list and shortening the list will shorten the search path. The default *max length* in bytes is set to `255`.
* _Renumber_ will renumber the application, beginning with line `0`. This makes the decoding a tiny bit more efficient in that the number to decode will be smaller in the token stream.
Sample use:
```java
program = program.accept(Optimization.REMOVE_REM_STATEMENTS.create(config));
```


@ -1,4 +1,4 @@
# BT API
# BASIC Tools API
The BASIC Tools API is a set of reusable code that can be used to parse a text-based Applesoft BASIC program and generate the appropriate tokens. It also has multiple types of visitors that can rewrite that parse tree to rearrange the code (calling them optimizations is a bit over-the-top).
@ -24,107 +24,9 @@ dependencies {
}
```
## Overview
## API descriptions
Generally, the usage pattern is:
1. Setup the `Configuration`.
2. Read the tokens.
3. Parse the tokens into a `Program`.
4. Apply transformations, if applicable.
Currently the API is broken into the following sections:
## Code snippets
```java
Configuration config = Configuration.builder()
.sourceFile(this.sourceFile)
.build();
```
The `Configuration` class also allows the BASIC start address to be set (defaults to `0x801`), set the maximum line length (this is in bytes, and defaults to `255`, but feel free to experiment). Some of the classes report output via the debug stream, which defaults to a simple null stream (no output) - replace with `System.out` or another `PrintStream`.
```java
Queue<Token> tokens = TokenReader.tokenize(config.sourceFile);
```
The list of tokens is a loose interpretation. It includes more of a compiler sense of tokens -- numbers, end of line markers (they're significant), AppleSoft tokens, strings, comments, identifiers, etc.
```java
Parser parser = new Parser(tokens);
Program program = parser.parse();
```
The `Program` is now the parsed version of the BASIC program. Various `Visitor`s may be used to report, gather information, or manipulate the tree in various ways.
## Directives
The framework allows embedding of directives.
### `$embed`
`$embed` will allow a binary to be embedded within the resulting application *and will move it to a destination in memory*. Please note that once the application is loaded on the Apple II, the program cannot be altered as the computer will crash. Usage example:
```
5 $embed "read.time.bin", "0x0260"
```
The `$embed` directive _must_ be last on the line (if there are comments, be sure to use the `REMOVE_REM_STATEMENTS` optimization. It takes two parameters: file name and target address, both are strings.
From the `circles-timing.bas` sample, this is the beginning of the program:
```
0801:9A 09 00 00 8C 32 30 36 32 3A AB 31 00 A9 2B 85
\___/ \___/ \____________/ \___/ \_______...
Ptr, Line 0, CALL 3062, :, GOTO 1, Assembly code...
```
The move code is based on what Beagle Bros put into their [Peeks, Pokes, and Pointers](https://beagle.applearchives.com/Posters/Poster%202.pdf) poster. (See _Memory Move_ under the *Useful Calls*; the `CALL -468` entry.)
```
LDA #<embeddedStart
STA $3C
LDA #>embeddedStart
STA $3D
LDA #<embeddedEnd
STA $3E
LDA #>embeddedEnd
STA $3F
LDA #<targetAddress
STA $42
LDA #>targetAddress
STA $43
LDY #0
JMP $FE2C
```
### `$hex`
If embedding hexidecimal addresses into an application makes sense, the `$hex` directive allows that to be done in a rudimentary manner.
Sample:
```
10 call $hex "fc58"
```
Yields:
```
10 call -936
```
## Optimizations
Optimizations are mechanisms to rewrite the `Program`, typically making the program smaller. `Optimization` itself is an enum which has a `create` method to setup the `Visitor`.
Current optimizations are:
* _Remove empty statements_ will remove all extra colons. For example, if the application in question used `:` to indicate nesting. Or just accidents!
* _Remove REM statements_ will remove all comments.
* _Extract constant values_ will find all constant numerical references, insert a line `0` with assignments, and finally replace all the numbers with the approrpiate variable name. Hypothesis is that the BASIC interpreter only parses the number once.
* _Merge lines_ will identify all lines that are not a target of `GOTO`/`GOSUB`-type action and rewrite the line by merging it with others. The concept involved is that the BASIC program is just a linked list and shortening the list will shorten the search path. The default *max length* in bytes is set to `255`.
* _Renumber_ will renumber the application, beginning with line `0`. This makes the decoding a tiny bit more efficient in that the number to decode will be smaller in the token stream.
Sample use:
```java
program = program.accept(Optimization.REMOVE_REM_STATEMENTS.create(config));
```
* [BASIC Tokenizer](README-TOKENIZER.md)
* [Shape Tooling](README-SHAPES.md)


@ -13,15 +13,47 @@ Usage: st [-hV] [--debug] [COMMAND]
Shape Tools utility
Options:
--debug Dump full stack trackes if an error occurs
--debug Dump full stack traces if an error occurs
-h, --help Show this help message and exit.
-V, --version Print version information and exit.
Commands:
extract Extract shapes from shape table
generate Generate a shape table from source code
help Displays help information about the specified command
```
## Sub-command help
```shell
$ st extract --help
Usage: st extract [-h] [--skip-empty] [--stdin] [--stdout]
[--border=<borderStyle>] [--format=<outputFormat>]
[--shape=<shapeNum>] [-o=<outputFile>] [-w=<width>]
[<inputFile>]
Extract shapes from shape table
Parameters:
[<inputFile>] File to process
Options:
--border=<borderStyle>
Set border style (none, simple, box)
Default: simple
--format=<outputFormat>
Select output format (text, png, gif, jpeg, bmp, wbmp)
Default: text
--shape=<shapeNum> Extract specific shape
--skip-empty Skip empty shapes
--stdin Read from stdin
--stdout Write to stdout
-h, --help Show help for subcommand
-o, --output=<outputFile>
Write output to file
-w, --width=<width> Set width (defaults: text=80, image=1024)
```
## Text extract
```shell
@ -77,3 +109,16 @@ $ st --debug extract --shape 3 --output robot.png --format png --border box ~/Do
$ st --debug extract --output=new-mouse-shapes.png --border=box --skip-empty --format=png ~/Downloads/shapes/NEW\ MOUSE
```
![All shapes](images/new-mouse-shapes.png "All shapes")
## Shape generation
```shell
$ st generate --stdout api/src/test/resources/box-longform.st | st extract --stdin --stdout
+-----+
|.XXX.|
|X...X|
|X.+.X|
|X...X|
|.XXX.|
+-----+
```