CAP/support/enet/enet.4
2015-03-18 22:05:00 +01:00

753 lines
20 KiB
Groff

.TH ENET 4 "17 January 1986" Stanford
.SH "NAME"
enet \- ethernet packet filter
.SH SYNOPSIS
.B "pseudo-device enetfilter 64"
.SH "DESCRIPTION"
The packet filter
provides a raw interface to Ethernets and similar network data link layers.
Packets received that are not used by the kernel
(i.e., to support IP, ARP, and on some systems XNS, protocols)
are available through this mechanism.
The packet filter appears as a set of character special files, one
per hardware interface.
Each
.I enet
file may be opened multiple times, allowing each interface to be used by
many processes.
The total number of open ethernet
files is limited to the value given in the kernel configuration; the
example given in the SYNOPSIS above sets the limit to 64.
.PP
The minor device numbers
are associated with interfaces when the system is booted.
Minor device 0
is associated with the first Ethernet interface ``attached'',
minor device 1 with the second, and so forth.
(These character special files are, for historical reasons,
given the names
.IR /dev/enet0 ,
.IR /dev/eneta0 ,
.IR /dev/enetb0 ,
etc.)
.PP
Associated with each open instance of an
.I enet
file is a user-settable packet filter which is used to deliver
incoming ethernet packets to the appropriate process.
Whenever a packet is received from the net,
successive packet filters from the list of filters for
all open
.I enet
files are applied to the packet.
When a filter accepts the packet,
it is placed on the packet input queue of the
associated file.
If no filters accept the packet, it is discarded.
The format of a packet filter is described below.
.PP
Reads from these files return the next packet
from a queue of packets that have matched the filter.
If insufficient buffer space to store the entire packet is specified in the
read, the packet will be truncated and the trailing contents lost.
Writes to these devices transmit packets on the
network, with each write generating exactly one packet.
.PP
The packet filter currently supports a variety of different ``Ethernet''
data-link levels:
.IP "3mb Ethernet" 1.5i
packets consist of 4 or more bytes with the first byte
specifying the source ethernet address, the second
byte specifying the destination ethernet address,
and the next two bytes specifying the packet type.
(Actually, on the network the source and destination addresses
are in the opposite order.)
.IP "byte-swapping 3mb Ethernet" 1.5i
packets consist of 4 or more bytes with the first byte
specifying the source ethernet address, the second
byte specifying the destination ethernet address,
and the next two bytes specifying the packet type.
\fBEach short word (pair of bytes) is swapped from the network
byte order\fR; this device type is only provided as a concession
to backwards-compatibility.
.IP "10mb Ethernet" 1.5i
packets consist of 14 or more bytes with the first six
bytes specifying the destination ethernet address,
the next six bytes the source ethernet address,
and the next two bytes specifying the packet type.
.PP
The remaining words are interpreted according to the packet type.
Note that 16-bit and 32-bit quantities may have to be byteswapped
(and possible short-swapped) to be intelligible on a Vax.
.PP
The packet filter mechanism does not know anything about the data portion
of the packets it sends and receives. The user must supply
the headers for transmitted packets (although the system makes sure that
the source address is correct) and the headers of received packets
are delivered to the user. The packet filters treat the entire packet,
including headers, as uninterpreted data.
.SH "IOCTL CALLS"
In addition to FIONREAD,
ten special
.I
ioctl
calls may be applied to an open
.I
enet
file.
The first two set and fetch parameters
for the file and are of the form:
.sp
.nf
.RS
.B #include <sys/types.h>
.B #include <sys/enet.h>
.B ioctl(fildes, code, param)
.B
struct eniocb *param;
.RE
.fi
.sp
where
.I
param
is defined in
.I
<sys/enet.h>
as:
.br
.sp
.nf
.RS
.ta \w'struct 'u +\w'u_char 'u
.ft B
struct eniocb
{
u_char en_addr;
u_char en_maxfilters;
u_char en_maxwaiting;
u_char en_maxpriority;
long en_rtout;
};
.DT
.RE
.fi
.ft R
.sp
with the applicable codes being:
.TP
EIOCGETP
Fetch the parameters for this file.
.TP
EIOCSETP
Set the parameters for this file.
.i0
.DT
.PP
The maximum filter length parameter en_maxfilters indicates
the maximum possible packet filter command
list length (see EIOCSETF below).
The maximum input wait queue size parameter en_maxwaitingindicates
the maximum number of packets which may be queued for an
ethernet file at one time (see EIOCSETW below).
The maximum priority parameter en_maxpriority indicates the highest
filter priority which may be set for the file (see EIOCSETF below).
The en_addr field is no longer maintained by the driver; see
EIOCDEVP below.
.PP
The read timeout parameter en_rtout specifies the number of clock ticks to
wait before timing out on a read request and returning an EOF.
This parameter is initialized to zero by
.IR open (2),
indicating no timeout. If it is negative, then read requests will return an
EOF immediately if there are no packets in the input queue.
(Note that all parameters except for the read timeout are read-only
and are ignored when changed.)
.PP
A different
.I ioctl
is used to get device parameters of the ethernet underlying the
minor device. It is of the form:
.sp
.nf
.RS
.B #include <sys/types.h>
.br
.B #include <sys/enet.h>
.B ioctl(fildes, EIOCDEVP, param)
.RE
.fi
.sp
where
.I param
is defined in
.I <sys/enet.h>
as:
.br
.sp
.nf
.RS
.ta \w'struct 'u +\w'u_short 'u
.ft B
struct endevp {
u_char end_dev_type;
u_char end_addr_len;
u_short end_hdr_len;
u_short end_MTU;
u_char end_addr[EN_MAX_ADDR_LEN];
u_char end_broadaddr[EN_MAX_ADDR_LEN];
};
.DT
.RE
.fi
.ft R
.sp
The fields are:
.IP end_dev_type 1.5i
Specifies the device type; currently one of ENDT_3MB, ENDT_BS3MB or ENDT_10MB.
.IP end_addr_len 1.5i
Specifies the address length in bytes (e.g., 1 or 6).
.IP end_hdr_len 1.5i
Specifies the total header length in bytes (e.g., 4 or 14).
.IP end_MTU 1.5i
Specifies the maximum packet size, including header, in bytes.
.IP end_addr 1.5i
The address of this interface; aligned so that the low order
byte of the address is the first byte in the array.
.IP end_broadaddr 1.5i
The hardware destination address for broadcasts on this network.
.PP
The next two calls enable and disable the input
packet signal mechanism
for the file and are of the form:
.sp
.nf
.RS
.B #include <sys/types.h>
.B #include <sys/enet.h>
.B ioctl(fildes, code, signp)
.B
u_int *signp;
.RE
.fi
.sp
where
.I
signp
is a pointer to a word containing the number
of the signal
to be sent when an input packet arrives and
with the applicable codes being:
.TP
EIOCENBS
Enable the specified signal when an input packet
is received for this file.
If the ENHOLDSIG flag (see EIOCMBIS below) is not set,
further signals are automatically disabled
whenever a signal is sent to prevent nesting and hence
must be specifically re-enabled after processing.
When a signal number of 0 is supplied,
this call is equivalent to EIOCINHS.
.TP
EIOCINHS
Disable any signal when an input
packet is received for this file
(the
.I signp
parameter is ignored).
This is the default when the file is first opened.
.i0
.DT
.PP
.sp
The next two calls set and clear ``mode bits'' for the
for the file and are of the form:
.sp
.nf
.RS
.B #include <sys/types.h>
.B #include <sys/enet.h>
.B ioctl(fildes, code, bits)
.B
u_short *bits;
.RE
.fi
.sp
where
.I bits
is a short work bit-mask specifying which bits to set or clear.
Currently, the only bit mask recognized is ENHOLDSIG, which (if
.IR clear )
means that the driver should
disable the effect of EIOCENBS once it has delivered a signal.
Setting this bit
means that you need use EIOCENBS only once. (For historical reasons, the
default is that ENHOLDSIG is set.)
The applicable codes are:
.TP
EIOCMBIS
Sets the specified mode bits
.TP
EIOCMBIC
Clears the specified mode bits
.PP
Another
.I
ioctl
call is used to set the maximum size
of the packet input queue for
an open
.I
enet
file.
It is of the form:
.sp
.nf
.RS
.B #include <sys/types.h>
.B #include <sys/enet.h>
.B ioctl(fildes, EIOCSETW, maxwaitingp)
.B u_int *maxwaitingp;
.RE
.fi
.sp
where
.I
maxwaitingp
is a pointer
to a word containing
the input queue size to be set.
If this is greater than maximum allowable
size (see EIOCGETP above), it is set to the maximum,
and if it is zero, it is set to
a default value.
.sp
Another
.I
ioctl
call flushes the queue of incoming packets.
It is of the
form:
.sp
.nf
.RS
.B #include <sys/types.h>
.B #include <sys/enet.h>
.B ioctl(fildes, EIOCFLUSH, 0)
.RE
.fi
.sp
The final generic
.I
ioctl
call is used to set the packet filter
for an open
.I
enet
file.
It is of the form:
.sp
.nf
.RS
.B #include <sys/types.h>
.B #include <sys/enet.h>
.B ioctl(fildes, EIOCSETF, filter)
.B struct enfilter *filter
.RE
.fi
.sp
where enfilter is defined in
.I
<sys/enet.h>
as:
.sp
.nf
.RS
.ft B
.ta \w'struct 'u \w'struct u_short 'u
struct enfilter
{
u_char enf_Priority;
u_char enf_FilterLen;
u_short enf_Filter[ENMAXFILTERS];
};
.DT
.ft R
.RE
.fi
.PP
A packet filter consists of a priority,
the filter command list length (in shortwords),
and the filter command list itself.
Each filter command list specifies
a sequence of actions which
operate on an internal stack.
Each shortword of the
command list specifies an action from the set {
.B
ENF_PUSHLIT,
.B
ENF_PUSHZERO,
.B
ENF_PUSHWORD+N
} which respectively push the next shortword of the
command list, zero,
or shortword
.B
N
of the incoming packet on the stack, and a binary operator
from the set {
.B
ENF_EQ,
.B
ENF_NEQ,
.B
ENF_LT,
.B
ENF_LE,
.B
ENF_GT,
.B
ENF_GE,
.B
ENF_AND,
.B
ENF_OR,
.B
ENF_XOR
}
which then operates on the
top two elements of the stack and replaces them with its result.
When both an action and operator are specified in the
same shortword, the action is performed followed by the
operation.
.PP
The binary operator can also be from the set {
.B
ENF_COR,
.B
ENF_CAND,
.B
ENF_CNOR,
.B
ENF_CNAND
}. These are ``short-circuit'' operators, in that they terminate
the execution of the filter immediately if the condition they are checking
for is found, and continue otherwise.
All pop two elements from the stack and compare them for equality;
.B
ENF_CAND
returns false if the result is false;
.B
ENF_COR
returns true if the result is true;
.B
ENF_CNAND
returns true if the result is false;
.B
ENF_CNOR
returns false if the result is true.
Unlike the other binary operators, these four do not leave a result
on the stack, even if they continue.
.PP
The short-circuit operators should be used when possible, to reduce the
amount of time spent evaluating filters. When they are used, you should
also arrange the order of the tests so that the filter will succeed or fail
as soon as possible; for example, checking the Socket field of a Pup packet
is more likely to indicate failure than the packet type field.
.PP
The
special action
.B
ENF_NOPUSH
and the special operator
.B
ENF_NOP
can be used to only
perform the binary operation or to only push a value on the stack.
Since both are (conveniently) defined to be zero, indicating
only an action actually specifies the action followed by
.BR ENF_NOP ,
and
indicating only an operation actually specifies
.B
ENF_NOPUSH
followed
by the operation.
.PP
After executing the filter command list, a non-zero value (true)
left on top of the stack
(or an empty stack) causes the incoming
packet to be accepted for the corresponding
.I
enet
file and a zero value (false) causes the packet to
be passed through the next packet filter.
(If the filter exits as the result of a short-circuit operator,
the top-of-stack value is ignored.)
Specifying an undefined operation or action in the command list
or performing an illegal operation or action (such as pushing
a shortword offset
past the end of the packet or executing a binary operator
with fewer than two shortwords on the stack) causes a filter to
reject the packet.
.sp
In an attempt to deal with the problem of
overlapping and/or conflicting packet filters,
the filters for each open
.I
enet
file are ordered by the driver
according to their priority
(lowest
priority is 0, highest is 255).
When processing incoming
ethernet
packets, filters are applied according to their
priority (from highest to lowest) and
for identical priority values according to their
relative ``busyness'' (the filter that has previously
matched the most packets is checked first) until one or more filters
accept the packet or all filters reject it and
it is discarded.
.PP
Filters at a priority of 2 or higher are called "high priority"
filters.
Once a packet is delivered to one of these "high priority"
.I
enet
files,
no further filters are examined,
i.e.
the packet is delivered only
to the first
.I
enet
file with a
"high priority" filter
which accepts the packet.
A packet may be delivered to more than one filter with a priority
below 2; this might be useful, for example, in building replicated programs.
However, the use of low-priority filters imposes an additional cost on
the system, as these filters each must be checked against all packets not
accepted by a high-priority filter.
.PP
The packet filter for an
.I
enet
file is initialized
with length 0 at priority 0 by
.IR open (2),
and hence by default accepts all
packets which no "high priority" filter
is interested in.
.PP
Priorities should be assigned so that, in general, the more packets a
filter is expected to match, the higher its priority. This will prevent
a lot of needless checking of packets against filters that aren't likely
to match them.
.i0
.DT
.PP
.sp
There is a special
.I
ioctl
for use on the Sun. Because of the way the packet filter is interfaced
to SunOS, it is necessary to specify what Ethernet type codes are to
be handled by the packet filter. This is done using the
.TP
EIOCETHERT
call.
.sp
.nf
.RS
.B #include <sys/types.h>
.B #include <sys/enet.h>
.B ioctl(fildes, EIOCETHERT, ethert)
.B int *ethert
.RE
.fi
.sp
This will permanently (i.e. until the next reboot) place packets with
the specified type code under control of the packet filter mechanism.
If more than one type code is under control of the packet filter, you
must use filters to make sure that the right packet types are handed
to the right file descriptors. This ioctl affects the entire system.
That is, access to packets of the specified packet type is not limited
to the specific file descriptor (or even process) for which it is done
.sp
If the specified type code is already under control of the packet filter,
the error EEXIST will be returned. Rather than trying to figure out
whether some program has already run that specifies a given packet type,
it is probably best just to ignore the EEXIST error.
.SH "FILTER EXAMPLES"
The following filter would accept all incoming
.I Pup
packets on a 3mb ethernet with Pup types in the range 1-0100:
.sp
.nf
.ft B
.ta \w'stru'u \w'struct ENF_PUSHWORD+8, ENF_PUSHLIT, 2, 'u
struct enfilter f =
{
10, 19, /* priority and length */
ENF_PUSHWORD+1, ENF_PUSHLIT, 2,
ENF_EQ, /* packet type == PUP */
ENF_PUSHWORD+3, ENF_PUSHLIT,
0xFF00, ENF_AND, /* mask high byte */
ENF_PUSHZERO, ENF_GT, /* PupType > 0 */
ENF_PUSHWORD+3, ENF_PUSHLIT,
0xFF00, ENF_AND, /* mask high byte */
ENF_PUSHLIT, 0100, ENF_LE, /* PupType <= 0100 */
ENF_AND, /* 0 < PupType <= 0100 */
ENF_AND /* && packet type == PUP */
};
.DT
.ft R
.fi
.sp
Note that shortwords, such as the packet type field, are byte-swapped
and so the literals you compare them to must be byte-swapped. Also,
although for this example the word offsets are constants, code that
must run with either 3mb or 10mb ethernets must use
offsets that depend on the device type.
.PP
By taking advantage of the ability to
specify both an action and operation in each word of
the command list, the filter could be abbreviated to:
.sp
.nf
.ta \w'stru'u \w'struct ENF_PUSHWORD+3, ENF_PUSHLIT | ENF_AND, 'u
.ft B
struct enfilter f =
{
10, 14, /* priority and length */
ENF_PUSHWORD+1, ENF_PUSHLIT | ENF_EQ, 2, /* packet type == PUP */
ENF_PUSHWORD+3, ENF_PUSHLIT | ENF_AND,
0xFF00, /* mask high byte */
ENF_PUSHZERO | ENF_GT, /* PupType > 0 */
ENF_PUSHWORD+3, ENF_PUSHLIT | ENF_AND,
0xFF00, /* mask high byte */
ENF_PUSHLIT | ENF_LE, 0100, /* PupType <= 0100 */
ENF_AND, /* 0 < PupType <= 0100 */
ENF_AND /* && packet type == PUP */
};
.ft R
.DT
.fi
.sp
A different example shows the use of "short-circuit" operators to
create a more efficient filter. This one accepts Pup packets (on a 3Mbit
ethernet) with a Socket field of 12345. Note that we check the Socket field
before the packet type field, since in most
packets the Socket is not likely to match.
.sp
.nf
.ta \w'stru'u \w'struct ENF_PUSHWORD+3, ENF_PUSHLIT | ENF_CAND, 'u
.ft B
struct enfilter f =
{
10, 9, /* priority and length */
ENF_PUSHWORD+7, ENF_PUSHLIT | ENF_CAND,
0, /* High word of socket */
ENF_PUSHWORD+8, ENF_PUSHLIT | ENF_CAND,
12345, /* Low word of socket */
ENF_PUSHWORD+1, ENF_PUSHLIT | ENF_CAND,
2 /* packet type == Pup */
};
.ft R
.DT
.fi
.SH "PROGRAMMING NOTES"
In order to get a working program, you don't need most of the ioctl's.
Generally it's good enough simply to open /dev/enet, then to do
IOCSETF to establish a packet filter that specifies what packets you
want to see, and if you are on a Sun, IOCETHERT to make sure that the
packet filter sees the packet type involved. It does not matter which
order you do IOCSETF and IOCETHERT. IOCETHERT only needs to be done
once for each packet type. If you open several descriptors to process
packets of the same packet type, IOCETHERT only needs to be done once.
However it's probably more convenient to do it for each file
descriptor. You'll get EEXIST most of the time, but that is harmless.
There is a good chance that you'll want to do IOCSETW, to set the
number of packets that can be queued for input at one time. The
default is 2. This may be a bit low for some applications.
In order to get good performance, you will probably want to make sure
that all of your applications use the same priority in their filters.
The driver will automatically reorder the list so that often-used
filters migrate to the front of the list, but it will only sort
filters with the same priority. Assuming that you are using a "high
priority" (which is currently defined as any priority above 1), the
code runs all filters in turn until it finds one that succeeds. If
you have a number of filters it is very important to make sure they
are tried in the right order. You can use etherstat(8) to look at all
the filters and make sure that the order is sensible.
.SH "SEE ALSO"
de(4), ec(4), en(4), il(4), enstat(8)
.SH "FILES"
/dev/enet{,a,b,c,...}0
.SH BUGS
The current implementation can only filter on words within
the first "mbuf" of the packet; this is around 100 bytes (or
50 words).
.PP
Because packets are streams of bytes, yet the filters operate
on short words, and standard network byte order is usually opposite
from Vax byte order, the relational operators
.B ENF_LT, ENF_LE,
.B ENF_GT,
and
.B ENF_GE
are not all that useful. Fortunately, they were not often used
when the packets were treated as streams of shorts, so this is
probably not a severe problem. If this becomes a severe problem,
a byte-swapping operator could be added.
.PP
Many of the "features" of this driver are there for historical
reasons; the manual page could be a lot cleaner if these were
left out.
.PP
It is not at all clear that the algorithm for reordering the
filters make sense. It uses the count of the number of times
the filter has been used, and only moves a filter up if its
count is 100 more than the one above it. This is claimed to
prevent thrashing. Since moving a filter is fairly low in
cost compared to running a filter, it would probably
make more sense to move a filter up whenever it is used. Then
the list would rearrange itself with the filters currently being
used near the top.
.SH "HISTORY"
.TP
8-Oct-85 Jeffrey Mogul at Stanford University
Revised to describe 4.3BSD version of driver.
.TP
18-Oct-84 Jeffrey Mogul at Stanford University
Added short-circuit operators, changed discussion of priorities to
reflect new arrangement.
.TP
18-Jan-84 Jeffrey Mogul at Stanford University
Updated for 4.2BSD (device-independent) version, including
documentation of all non-kernel ioctls.
.TP
17-Nov-81 Mike Accetta (mja) at Carnegie-Mellon University
Added mention of <sys/types.h> to include examples.
.TP
29-Sep-81 Mike Accetta (mja) at Carnegie-Mellon University
Changed to describe new EIOCSETW and EIOCFLUSH ioctl
calls and the new multiple packet queuing features.
.TP
12-Nov-80 Mike Accetta (mja) at Carnegie-Mellon University
Added description of signal mechanism for input packets.
.TP
07-Mar-80 Mike Accetta (mja) at Carnegie-Mellon University
Created.