executor/docs/cleanup_ideas

648 lines
18 KiB
Plaintext

Cleanup goals:
- Consistency
- Cleanliness
- Conciseness
- Modularity
- easier to maintain and extend
- easier to remove and license/sell parts of Executor (e.g. HFS code)
- Robustness
- Portability
- Debuggability
- Superior tools
- Nicer learning curve for newcomers
- Reduced redundancy in the source
- A more efficient Executor binary
- smaller
- faster
- better locality of reference
- smaller memory footprint
- faster load times on demand-paged systems
----------------------------------------------------------------------
Coding standards:
We will use the coding conventions described in the GNU coding
standards, with some extensions:
- You may assume GNU make and gcc. Gratuitously using features
specific to either is a bad idea, but if using a GNU extension is
clearly the best (or only) way to write something, don't hesitate to
use it.
- Always use full ANSI C prototypes. When a function takes no arguments,
place an explicit "void" in the argument list both in the declaration
and in the definition.
- You may assume alloca is present, but do not assume you have a large
amount of stack space. Under djgpp we have only a default stack of
256K, which is why the rsys/tempalloc.h package was created.
- You may assume ANSI C libraries. We will never compile Executor on
a system that doesn't have ANSI libraries. As such, know what ANSI
provides and use those library functions when appropriate (instead
of hand-rolled loops, etc.) Use ANSI functions rather than their
old Unix counterparts when possible (e.g. memset rather than bzero,
and memcpy or memmove rather than bcopy).
- All source files will be maintained with RCS. This of course does
not apply to files which are mechanically generated from other
files. It may sometimes be appropriate to log derived files with
RCS, but usually it's not.
- Make functions and "global" variables `static' when possible.
This has the following benefits:
- gcc can warn about unused static items (which only waste space).
- It reduces the chance of an identifier clash with a symbol
in another source file.
- Use `gui_assert' and `warning_unexpected' statements liberally.
They take little time to add in, but can be an invaluable debugging
aid.
- Don't modify function parameters. This can make debugging confusing
as it makes it unclear what the original argument to the function
was.
- All functions will have exactly one `return' statement, counting the
implicit `return' at the end of a void function. This is to make
debugging easier (e.g. setting a breakpoint to see when a certain
value is returned), and to avoid certain kinds of bugs that can
happen with early exits.
- `return' statements should never return anything more complex than
a single variable or constant. This is to facilitate setting
debugger breakpoints.
- If you are writing a function that returns something by reference,
always fill that value in, even in the case of an error when you
don't expect that value to be used. Uninitialized memory errors can
be frustrating to track down, and gcc won't warn you about them in
this case. An exception of course would be routines that are
explicitly not supposed to modify the return-by-reference memory in
the case of an error.
- `goto' is often to be avoided, but don't hesitate to use it if it
makes your code simpler. For example, if you want to return an
error code from deep within nested `if' statements, set a variable
indicating the return code and `goto' the end of the function where
that variable is returned. kludges involving magic loop variables
or smashing loop counters and executing break statements are usually
more difficult to understand than a simple goto. A goto will
continue to work even if another nested layer of looping is added
outside, or if code is added after the loop that you really didn't
want to execute.
- Machine-generated code need not conform to any particular coding
standard, although it's more readable if it does. Such code should
at least be run through `indent' (if appropriate) to make it more
human-readable.
- Always declare variables in the innermost possible lexical scope.
- Never shadow identifiers in outer scopes.
- Never use the same variable name in two scopes in the same function
(it's OK when the variable is a magic temp variable inside a macro).
- Avoid using the same variable for two different purposes. Use two
variables instead. Reusing a variable can result in subtle bugs and
makes optimization more difficult for the compiler (gcc's register
allocator still can't do live range splitting). It sometimes makes
debugging more difficult when a value of interest gets clobbered for
no good reason.
- When declaring a hidden variable inside a macro, place an underscore
both before and after the identifier name. This is to avoid clashes
with local variables in outer scopes, and also to make clear the
special nature of the variable. So for example, SQUARE() could
be written as this GNU-style macro:
#define SQUARE(a) ({ typeof (a) _x_ = (a); _x_ * _x_; })
- Never use the same variable to hold both big and little endian
values. For example, don't do:
return_by_reference (&foo); /* This is */
foo = CL (foo); /* bad */
instead, use a temp variable to hold the big endian value,
or create a helper function that returns the value in the
byte order you want.
- Use functions rather than macros when possible, at least when the
expression is non-trivial. Functions are far easier to debug and do
type checking that (simple) macros can't. There are of course
numerous exceptions to this rule; for example, if you want to
"return" alloca'd memory you must do it from within a macro.
- Generally avoid macros that modify variables where a function call
could not. When you must modify variables, pass the variables as
arguments to the macro so it's at least clear that they are being
used. For example:
a = MACRO(); /* Best */
MACRO(a); /* Only if you really need to */
MACRO(); /* (where `a' is secretly modified) Evil! */
- Use `int' for most loop counters, even if a short will suffice. Old
versions of gcc could generate better looping code for shorts on the
m68k, but shorts are worse on other architectures, and newer gcc's
handle int loop counters well on the m68k.
- In loops, don't use fancy pointer arithmetic instead of a simple
array index just because you think it might make your code faster.
Unlike the m68k, the x86 and most RISC chips don't have general
postincrement and predecrement addressing modes, so this complexity
is usually wasted and often counterproductive.
- Strive to avoid preprocessor conditionals, and instead move code to
configuration files. This is not always possible, but often it is.
Tangles of #if's are much harder to read and maintain than code
which knows that the host implements a well-defined API for
system-specific features.
- Use "#if defined (FOO)" rather than "#ifdef FOO". This makes
it easier to add more clauses to the conditional, e.g.
"#if 0 && defined (FOO)"
- When you do need to use preprocessor conditionals, try not to use
conditionals based on particular operating systems; instead, make
the #ifdef's conditional upon the particular feature of interest.
For example, test "defined (USE_STRUCT_DIRECT)", rather than
"defined (NEXTSTEP)" if you want to see whether to use struct direct
or struct dirent. There are reasonable exceptions to this rule
where some feature is so intimately tied to an operating system that
there's no reason to bother making a specific macro. An example
would be conditional compilation for an ObjC NEXTSTEP call; there's
no reason to waste your time bothering to come up with a specific
conditional that only gets defined for NEXTSTEP, because it will
never be used elsewhere. One rule of thumb is to avoid using two
conditionals, e.g.:
#if defined (NEXTSTEP) || defined (LINUX)
In this case you should create a new macro and define it in both the
NEXTSTEP and LINUX config headers.
- All internal Executor types will end in `_t', except for
u?int{8,16,32}. Each Macintosh types will have its canonical name.
- For Macintosh struct fields defined in the documentation, use
standard Mac data types (e.g. INTEGER, LONGINT, etc.)
- Use enums rather than #define's when possible, because gdb knows
about enums. This is not possible when those values are used in
preprocessor conditionals, but that is rare.
- When you want a boolean value, and you're not in a context where
you should use the Mac BOOLEAN type, use `boolean_t'. That's an
enum and gdb knows about enums.
- When you specifically want a signed or unsigned N-bit number and
you're not in a context where you should use a Mac type, use int8,
int16, int32, int64, uint8, uint16, uint32, and uint64. For
example, an array of bytes should be an array of `uint8', not an
array of `unsigned char'. Configuration files will guarantee that
those fundamental types will be correct for all architectures and
compilers. If you just want an integer with at least 32 bits, use
`int' or `unsigned'. As per the GNU coding standards, you may
assume that ints have at least 32 bits. It is, however, possible
they may have more bits someday, so use `int32' or `uint32' when you
want exactly 32 bits.
- When creating an enum or any collection of related constant values,
try to use a common naming prefix for the identifiers that groups
them all together. This is not necessary for those rare macros
whose lifetime spans only a few lines, but is useful for values in
header files.
- Do not assume that pointer types are 32 bits. This is false on the
DEC Alpha, where they are 64 bits. Cliff modified gcc slightly to
allow pointer bitfields to let us typedef Mac structs with the
proper size and semantics. In some situations you will need to
assume that only the lower 32 bits of the pointer are significant,
since we frequently store pointers in Mac data structures that only
allocate 32 bits for pointers. We can make this happen under OSF on
the Alpha by using the `-taso' linker flag.
- If you have an if...else clause, and one consequent is substantially
shorter than the other, adjust the sense of the `if' so the shorter
code sequence goes first and the longer code sequence goes in the
`else' clause.
- Don't hesitate to create new source files if that's the cleanest way
to add a new feature or split something up into manageable pieces.
Place those files in specific config directories if that's where
they belong, to avoid cluttering up the main source tree.
Why is this a good idea?
- It can be difficult to wade through the RCS diffs for enormous
source files to find a change of interest. The bigger the file,
the more changes take place. As of this writing, main.c is
nearly 2000 lines long and has gone through 177 revisions since
work on Executor 2 started.
- It is sometimes useful to "roll back" files to earlier versions,
to track down a regression. Gigantic files with numerous
unrelated changes are almost impossible to roll back without
breaking something else.
- The more interesting things are in a file, the more likely there
is to be resource contention for that file between developers.
This problem will only get worse as ARDI gets more programmers.
- Small files compile faster (although, to be fair, if a lot
of junk gets #included then multiple files can be slower)
- Having more .o files gives us more freedom to choose a link
ordering that minimizes our memory footprint (NEXTSTEP has tools
to do this, and we can make rough-hewn tools for other platforms
with gprof and Perl).
- When printing a hexadecimal number in a diagnostic, always preface
it with "0x". Don't make the person reading the diagnostic guess
whether "400" really means four hundred or whether it's one thousand
twenty four.
UNRESOLVED:
- Infinite loops, when necessary, should be written as:
while (1) /* or should that be while (TRUE)? */
{
}
- [need naming convention for variables beyond GNU standards, e.g.
..._be suffix for big endian? What about ROMlib_ prefix,
or suffixes indicating type information?]
- Predicate variables or functions which return boolean will be
annotated with "_p".
- Variables which are return-by-reference pointers are annotated
with "_ref".
- Variables which contain big endian values are annotated with "_be".
A big endian int would have this suffix. A byte-swapped pointer
to a struct would have it. A pointer to a byte-swapped struct
would *not* have it, because the pointer variable itself is
big endian. A big endian Point struct sitting on the stack would
have the "_be" annotation.
----------------------------------------------------------------------
----------------------------------------------------------------------
There are two classes of transformations to the source tree:
- Those that affect the resulting Executor binary
- Those that do not
Modifications which do not affect the resulting binary are easy to
verify for correctness: make the modifications, recompile, and then
make sure the binary is unchanged. Consequently, such modifications
are "safe" only if made independently of mods that change the Executor
binary. To be truly safe, Executor binaries for all supported
platforms should be recompiled and compared before moving on.
It is still possible, of course, that an error can creep in somewhere
and not get noticed. A simple case is conditionally compiled code
that doesn't get compiled during the binary comparison test, but there
are more complex cases.
Although convenient, it is not essential (or even wise) to do in one
pass all mods which shouldn't change the Executor binary. Doing so
may save verification runs, but will cause extra headaches when the
verification fails. Changing several things in one step also makes
RCS diffs confusing. That doesn't mean that it is never appropriate
to group two semi-related sets of changes together; use your judgment.
----------------------------------------------------------------------
Split source tree into subdirectories containing related groups of
files and utilities?
./
executor.make
ae/
AE.c
AE_coercion.c
AE_desc.c
AE_hdlr.c
print/
PSprint.c
PSstrings.c
prError.c
prInit.c
prLowLevel.c
prPrinting.c
prRecords.c
ctl/
ctlArrows.c
ctlDisplay.c
ctlIMIV.c
ctlInit.c
ctlMiniarrows.c
ctlMisc.c
ctlMouse.c
ctlPopup.c
ctlSet.c
ctlSize.c
ctlStddef.c
dial/ [dlog?]
dialAlert.c
dialCreate.c
dialDispatch.c
dialHandle.c
dialInit.c
dialItem.c
dialManip.c
file/
fileAccess.c
fileCreate.c
fileDirs.c
fileDouble.c
fileHighlevel.c
fileInfo.c
fileMisc.c
fileVolumes.c
diskinit.c
slash.c
disk.c
hfs/ [merged with file?]
hfsBtree.c
hfsChanging.c
hfsCreate.c
hfsFile.c
hfsHelper.c
hfsHier.c
hfsMisc.c
hfsVolume.c
hfsWorkingdir.c
hfsXbar.c
list/
listAccess.c
listAddDel.c
listCreate.c
listDisplay.c
listMouse.c
listOps.c
listStdLDEF.c
sane/
float4.c
float5.c
float7.c
floatnext.c
menu/
menu.c
menuColor.c
menuV.c
stdmbdf.c
stdmdef.c
res/
resGet.c
resGetinfo.c
resGettype.c
resIMIV.c
resInit.c
resMisc.c
resMod.c
resOpen.c
resPartial.c
resSetcur.c
qd/
default_ctab_values.c
hintemplate.h
makerawblt.pl
mkseedtables.c
mksspairtable.c
mkultable.c
qBit.c
qCConv.c
qCGrafPort.c
qCRegular.c
qColor.c
qColorMgr.c
qColorPicker.c
qColorutil.c
qCursor.c
qGDevice.c
qGWorld.c
qGrafport.c
qHooks.c
qIMIV.c
qIMV.c
qIMVI.c
qIMVxfer.c
qMisc.c
qPaletteMgr.c
qPen.c
qPicstuff.c
qPict2.c
qPicture.c
qPixMapConv.c
qPoint.c
qPoly.c
qRect.c
qRegion.c
qRegular.c
qScale.c
qStandard.c
qStdArc.c
qStdBits.c
qStdLine.c
qStdOval.c
qStdPic.c
qStdPoly.c
qStdRRect.c
qStdRect.c
qStdRgn.c
qStdText.c
qText.c
autorefresh.c
dirtyrect.c
dcconvert.c
dcmaketables.c
rawpatblt.c
rawsrcblt.c
refresh.c
rgbutil.c
pat-blitters.tmpl
src-blitters.tmpl
srcblt.c
xdata.c
xdblt.c
serial/
serial.c
te/
teAccess.c
teDisplay.c
teEdit.c
teIMIV.c
teIMV.c
teInit.c
teInsert.c
teMisc.c
teScrap.c
script/
script.c
mman/
mman.c
mmansubr.c
wind/
windColor.c
windDisplay.c
windDocdef.c
windInit.c
windMisc.c
windMouse.c
windSize.c
windUpdate.c
sound/
sound.c
soundIMVI.c
snth5.c
map/
map_to_c.c
active.map
apple.map
arrow_down_active.map
arrow_down_inactive.map
arrow_left_active.map
arrow_left_inactive.map
arrow_right_active.map
arrow_right_inactive.map
arrow_up_active.map
arrow_up_inactive.map
go_away.map
grow.map
ractive.map
thumb_horiz.map
thumb_vert.map
zoom.map
glue/
emustubs.c
emutrap.c
emutraptables.c
tooltrap.awk
tooltrap2.awk
event/
hle.c
osevent.c
toolevent.c
ibm_keycodes.c
debug/
error.c
dump.c
prefs/
option.c
parse.y
parsenum.c
parseopt.c
time/
time.c
vbl.c
syncint.c
config/arch/m68k
priv.c
OBSOLETE/
genctopflags_h.tmpl
genctopflags_h.tmpl
genfndecls.c
genptocflags_h.tmpl
genstubify_h.tmpl
genstubify_s.tmpl
globals.awk
globals.c
globals.pl
mkexpandtables.c
mktable2.awk
mktable3.awk
romlib_stubs.c
ostrap.awk
setuid.c
stubify.awk
stubs.s
think.c
trapinfo
WTF_IS_THIS?
genrand_h.c
Unresolved
--------------------
aboutbox.c
adb.c
alias.c
balloon.c
bindec.c
color_wheel_bits.c
config
crc.c
desk.c
device.c
edition.c
executor.c
finder.c
gensplash.c
gestalt.c
icon.c
image.c
image_inits.c
include
iu.c
iv-stubs.c
keycode.c
launch.c
license.c
licensetext.c
main.c
mkvol
notify.c
osutil.c
pack.c
process.c
protector.c
scrap.c
screen-dump.c
segment.c
shutdown.c
sigio_multiplex.c
splash
splash.c
stdfile.c
string.c
syserr.c
system_error.c
tempmem.c
toolmath.c
toolutil.c
uniquefile.c
version.c
vgavdriver.c