mirror of
https://github.com/sheumann/hush.git
synced 2025-01-10 16:29:44 +00:00
7449e18190
Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
480 lines
17 KiB
HTML
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<html><head>
<!-- saved from http://www.win.tue.nl/~aeb/linux/lk/lk-10.html -->
<meta name="GENERATOR" content="SGML-Tools 1.0.9"><title>The Linux kernel: Processes</title>
</head>
<body>
<hr>
<h2><a name="s10">10. Processes</a></h2>

<p>Before looking at the Linux implementation, first a general Unix
description of threads, processes, process groups and sessions.
</p><p>
(See also <a href="http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap11.html">General Terminal Interface</a>)
</p><p>A session contains a number of process groups, and a process group
contains a number of processes, and a process contains a number
of threads.
</p><p>A session can have a controlling tty.
At most one process group in a session can be a foreground process group.
An interrupt character typed on a tty ("Teletype", i.e., terminal)
causes a signal to be sent to all members of the foreground process group
in the session (if any) that has that tty as controlling tty.
</p><p>All these objects have numbers, and we have thread IDs, process IDs,
process group IDs and session IDs.
</p><p>
</p><h2><a name="ss10.1">10.1 Processes</a>
</h2>

<p>
</p><h3>Creation</h3>

<p>A new process is traditionally started using the <code>fork()</code>
system call:
</p><blockquote>
<pre>pid_t p;

p = fork();
if (p == (pid_t) -1) {
        /* ERROR */
} else if (p == 0) {
        /* CHILD */
} else {
        /* PARENT */
}
</pre>
</blockquote>
<p>This creates a child as a duplicate of its parent.
Parent and child are identical in almost all respects.
In the code they are distinguished by the return value of <code>fork()</code>:
the parent learns the process ID of its child, while <code>fork()</code>
returns 0 in the child. (The child can find the process ID of its
parent using the <code>getppid()</code> system call.)
</p><p>
</p><h3>Termination</h3>

<p>Normal termination is when the process does
</p><blockquote>
<pre>exit(n);
</pre>
</blockquote>

or
<blockquote>
<pre>return n;
</pre>
</blockquote>

from its <code>main()</code> procedure. Only the low-order byte of <code>n</code>
is returned to its parent.
<p>Abnormal termination is usually caused by a signal.
</p><p>
</p><h3>Collecting the exit code. Zombies</h3>

<p>The parent does
</p><blockquote>
<pre>pid_t p;
int status;

p = wait(&status);
</pre>
</blockquote>

and collects two bytes:
<p>
<figure>
<eps file="absent">
<img src="ctty_files/exit_status.png">
</eps>
</figure></p><p>A process that has terminated but has not yet been waited for
is a <i>zombie</i>. It need only store these two bytes:
exit code and reason for termination.
</p><p>On the other hand, if the parent dies first, <code>init</code> (process 1)
inherits the child and becomes its parent.
</p><p>
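The two bytes collected by the parent can be decoded with the standard
<code>WIFEXITED</code>/<code>WEXITSTATUS</code> and <code>WIFSIGNALED</code>/<code>WTERMSIG</code> macros.
The following is only a minimal sketch; the helper name <code>child_outcome()</code>
and its return convention are assumptions made for illustration, not part of any API.
</p>

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that either kills itself with `sig`
 * (when sig > 0) or exits with `code`, then wait for it.
 * Returns the exit code (0..255), the negated signal number when the
 * child was killed by a signal, or -1000 on error. */
int child_outcome(int code, int sig)
{
    int status;
    pid_t p = fork();

    if (p == (pid_t) -1)
        return -1000;
    if (p == 0) {                       /* CHILD */
        if (sig > 0)
            raise(sig);                 /* abnormal termination */
        _exit(code);                    /* normal termination */
    }
    /* PARENT: until this wait the terminated child is a zombie */
    if (waitpid(p, &status, 0) != p)
        return -1000;
    if (WIFEXITED(status))
        return WEXITSTATUS(status);     /* the "exit code" byte */
    if (WIFSIGNALED(status))
        return -WTERMSIG(status);       /* the "reason" byte */
    return -1000;
}
```

<p>With this sketch, <code>child_outcome(259, 0)</code> gives 3, since only the
low-order byte of the exit code reaches the parent, and
<code>child_outcome(0, SIGKILL)</code> gives <code>-SIGKILL</code>.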
</p><h3>Signals</h3>

<p>
</p><h3>Stopping</h3>

<p>Some signals cause a process to stop:
<code>SIGSTOP</code> (stop!),
<code>SIGTSTP</code> (stop from tty: probably ^Z was typed),
<code>SIGTTIN</code> (tty input asked by background process),
<code>SIGTTOU</code> (tty output sent by background process, and this was
disallowed by <code>stty tostop</code>).
</p><p>Apart from ^Z there is also ^Y. The former stops the process
when it is typed, the latter stops it when it is read.
</p><p>Signals generated by typing the corresponding character on some tty
are sent to all processes that are in the foreground process group
of the session that has that tty as controlling tty. (Details below.)
</p><p>If a process is being traced, every signal (except <code>SIGKILL</code>) will stop it.
</p><p>
</p><h3>Continuing</h3>

<p><code>SIGCONT</code>: continue a stopped process.
</p><p>
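Stopping and continuing can be observed from the parent with
<code>waitpid()</code> and the <code>WUNTRACED</code> and <code>WCONTINUED</code> options.
The sketch below is an illustration only; the function name
<code>stop_continue_kill()</code> is hypothetical.
</p>

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: stop a child with SIGSTOP, continue it with SIGCONT,
 * then kill it. Returns 1 if every state change was observed through
 * waitpid(), 0 if not, -1 if fork() fails. */
int stop_continue_kill(void)
{
    int status, ok;
    pid_t p = fork();

    if (p == (pid_t) -1)
        return -1;
    if (p == 0)                         /* CHILD: just idle */
        for (;;)
            pause();

    /* WUNTRACED makes waitpid() report a stopped child */
    ok = kill(p, SIGSTOP) == 0
        && waitpid(p, &status, WUNTRACED) == p
        && WIFSTOPPED(status);

    /* WCONTINUED makes waitpid() report a continued child */
    ok = ok && kill(p, SIGCONT) == 0
        && waitpid(p, &status, WCONTINUED) == p
        && WIFCONTINUED(status);

    /* Always clean up: SIGKILL cannot be caught or ignored */
    kill(p, SIGKILL);
    if (waitpid(p, &status, 0) != p
        || !WIFSIGNALED(status) || WTERMSIG(status) != SIGKILL)
        ok = 0;
    return ok;
}
```

<p>A job control shell uses the same <code>waitpid()</code> options to notice that
one of its jobs has stopped.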
</p><h3>Terminating</h3>

<p><code>SIGKILL</code> (die! now!),
<code>SIGTERM</code> (please, go away),
<code>SIGHUP</code> (modem hangup),
<code>SIGINT</code> (^C),
<code>SIGQUIT</code> (^\), etc.
Many signals have as default action to kill the target.
(Sometimes with an additional core dump, when such is
allowed by rlimit.)
The signals <code>SIGCHLD</code> and <code>SIGWINCH</code>
are ignored by default.
All except <code>SIGKILL</code> and <code>SIGSTOP</code> can be
caught or ignored or blocked.
For details, see <code>signal(7)</code>.
</p><p>
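That <code>SIGKILL</code> and <code>SIGSTOP</code> cannot be caught shows up as an
error from <code>sigaction()</code>. A small sketch, in which the names
<code>can_catch()</code> and <code>noop_handler()</code> are invented for this example:
</p>

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>

/* Hypothetical handler that does nothing */
static void noop_handler(int sig)
{
    (void) sig;
}

/* Hypothetical probe: returns 1 if a handler can be installed for `sig`,
 * 0 if sigaction() refuses. The previous disposition is restored. */
int can_catch(int sig)
{
    struct sigaction sa, old;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = noop_handler;
    sigemptyset(&sa.sa_mask);

    if (sigaction(sig, &sa, &old) == -1)
        return 0;                       /* EINVAL for SIGKILL and SIGSTOP */
    sigaction(sig, &old, NULL);         /* put the old disposition back */
    return 1;
}
```

<p>Here <code>can_catch(SIGTERM)</code> and <code>can_catch(SIGINT)</code> return 1, while
<code>can_catch(SIGKILL)</code> and <code>can_catch(SIGSTOP)</code> return 0.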
</p><h2><a name="ss10.2">10.2 Process groups</a>
</h2>

<p>Every process is a member of exactly one <i>process group</i>,
identified by its <i>process group ID</i>.
(When the process is created, it becomes a member of the process group
of its parent.)
By convention, the process group ID of a process group
equals the process ID of the first member of the process group,
called the <i>process group leader</i>.
A process finds the ID of its process group using the system call
<code>getpgrp()</code>, or, equivalently, <code>getpgid(0)</code>.
One finds the process group ID of process <code>p</code> using
<code>getpgid(p)</code>.
</p><p>One may use the command <code>ps j</code> to see PPID (parent process ID),
PID (process ID), PGID (process group ID) and SID (session ID)
of processes. With a shell that does not know about job control,
like <code>ash</code>, each of its children will be in the same session
and have the same process group as the shell. With a shell that knows
about job control, like <code>bash</code>, the processes of one pipeline, like
</p><blockquote>
<pre>% cat paper | ideal | pic | tbl | eqn | ditroff > out
</pre>
</blockquote>

form a single process group.
<p>
</p><h3>Creation</h3>

<p>A process <code>pid</code> is put into the process group <code>pgid</code> by
</p><blockquote>
<pre>setpgid(pid, pgid);
</pre>
</blockquote>

If <code>pgid == pid</code> or <code>pgid == 0</code> then this creates
a new process group with process group leader <code>pid</code>.
Otherwise, this puts <code>pid</code> into the already existing
process group <code>pgid</code>.
A zero <code>pid</code> refers to the current process.
The call <code>setpgrp()</code> is equivalent to <code>setpgid(0,0)</code>.
<p>
</p><h3>Restrictions on setpgid()</h3>

<p>The calling process must be <code>pid</code> itself, or its parent,
and the parent can only do this before <code>pid</code> has done
<code>exec()</code>, and only when both belong to the same session.
It is an error if process <code>pid</code> is a session leader
(and this call would change its <code>pgid</code>).
</p><p>
</p><h3>Typical sequence</h3>

<p>
</p><blockquote>
<pre>p = fork();
if (p == (pid_t) -1) {
        /* ERROR */
} else if (p == 0) {    /* CHILD */
        setpgid(0, pgid);
        ...
} else {                /* PARENT */
        setpgid(p, pgid);
        ...
}
</pre>
</blockquote>

This ensures that regardless of whether parent or child is scheduled
first, the process group setting is as expected by both.
<p>
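A compilable variant of this sequence, for the case where the child is to
lead a new process group of its own, might look as follows. This is a sketch
under that assumption; the name <code>child_leads_new_group()</code> is hypothetical.
</p>

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: fork a child and make it leader of a new process
 * group, calling setpgid() on both sides of the fork.
 * Returns 1 if the parent then sees the child leading its own group,
 * 0 if not, -1 if fork() fails. */
int child_leads_new_group(void)
{
    int ok;
    pid_t p = fork();

    if (p == (pid_t) -1)
        return -1;
    if (p == 0) {                       /* CHILD */
        setpgid(0, 0);                  /* pgid 0: my own PID becomes the PGID */
        for (;;)
            pause();                    /* idle until the parent kills us */
    }
    /* PARENT: same request; harmless if the child got there first */
    setpgid(p, p);
    ok = (getpgid(p) == p) && (getpgrp() != p);

    kill(p, SIGKILL);
    waitpid(p, NULL, 0);
    return ok;
}
```

<p>Because the parent calls <code>setpgid()</code> itself, it can rely on the
child's process group as soon as its own call returns, whether or not the
child has run yet.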
</p><h3>Signalling and waiting</h3>

<p>One can signal all members of a process group:
</p><blockquote>
<pre>killpg(pgrp, sig);
</pre>
</blockquote>
<p>One can wait for children in one's own process group:
</p><blockquote>
<pre>waitpid(0, &status, ...);
</pre>
</blockquote>

or in a specified process group:
<blockquote>
<pre>waitpid(-pgrp, &status, ...);
</pre>
</blockquote>
<p>
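Put together, <code>killpg()</code> and <code>waitpid(-pgrp, ...)</code> let a parent
terminate and reap a whole group of children at once. The sketch below uses
<code>SIGKILL</code> so that it does not depend on inherited signal dispositions;
the function name <code>kill_group_of()</code> is an assumption of this example.
</p>

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: start n idle children in one new process group,
 * kill the whole group, and reap its members.
 * Returns the number of children reaped that died from SIGKILL,
 * or -1 if no child could be started. */
int kill_group_of(int n)
{
    pid_t pgrp = 0, p;
    int i, status, reaped = 0;

    for (i = 0; i < n; i++) {
        p = fork();
        if (p == (pid_t) -1)
            break;
        if (p == 0) {                   /* CHILD */
            setpgid(0, pgrp);           /* pgrp == 0 for the first child */
            for (;;)
                pause();
        }
        if (pgrp == 0)
            pgrp = p;                   /* first child is the group leader */
        setpgid(p, pgrp);               /* PARENT: same request */
    }
    if (pgrp == 0)
        return -1;

    killpg(pgrp, SIGKILL);              /* signal every member */

    /* wait only for children in that process group */
    while ((p = waitpid(-pgrp, &status, 0)) > 0)
        if (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL)
            reaped++;
    return reaped;
}
```

<p>For example, <code>kill_group_of(3)</code> should return 3 when all three
children could be forked.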
</p><h3>Foreground process group</h3>

<p>Among the process groups in a session at most one can be
the <i>foreground process group</i> of that session.
The tty input and tty signals (signals generated by ^C, ^Z, etc.)
go to processes in this foreground process group.
</p><p>A process can determine the foreground process group in its session
using <code>tcgetpgrp(fd)</code>, where <code>fd</code> refers to its
controlling tty. If there is no foreground process group, this returns some value
larger than 1 that is not the ID of any existing process group.
</p><p>A process can set the foreground process group in its session
using <code>tcsetpgrp(fd,pgrp)</code>, where <code>fd</code> refers to its
controlling tty, and <code>pgrp</code> is a process group in
its session, and this session still is associated to the controlling
tty of the calling process.
</p><p>How does one get <code>fd</code>? By definition, <code>/dev/tty</code>
refers to the controlling tty, entirely independent of redirects
of standard input and output. (There is also the function
<code>ctermid()</code> to get the name of the controlling terminal.
On a POSIX standard system it will return <code>/dev/tty</code>.)
Opening the name of the
controlling tty gives a file descriptor <code>fd</code>.
</p><p>
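The steps above (find the name, open it, ask for the foreground process
group) can be sketched as follows. The helper names
<code>foreground_pgrp()</code> and <code>in_foreground()</code> are invented for this example.
</p>

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns the foreground process group of the
 * controlling tty, or -1 if the controlling tty cannot be opened. */
pid_t foreground_pgrp(void)
{
    char name[L_ctermid];
    pid_t pgrp;
    int fd;

    ctermid(name);                      /* usually "/dev/tty" */
    fd = open(name, O_RDWR | O_NOCTTY);
    if (fd == -1)
        return (pid_t) -1;              /* probably no controlling tty */
    pgrp = tcgetpgrp(fd);
    close(fd);
    return pgrp;
}

/* Hypothetical helper: 1 if the caller is in the foreground process
 * group, 0 if it is in a background one, -1 if that cannot be determined. */
int in_foreground(void)
{
    pid_t fg = foreground_pgrp();

    if (fg == (pid_t) -1)
        return -1;
    return fg == getpgrp();
}
```

<p>A process started from an interactive job control shell would normally
see 1 here when run in the foreground and 0 when started with <code>&amp;</code>;
a process without a controlling tty sees -1.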
</p><h3>Background process groups</h3>

<p>All process groups in a session that are not the foreground
process group are <i>background process groups</i>.
Since the user at the keyboard is interacting with foreground
processes, background processes should stay away from the terminal.
When a background process reads from the terminal it gets
a SIGTTIN signal. Normally, that will stop it; the job control shell
notices and tells the user, who can say <code>fg</code> to continue
this background process as a foreground process, and then this
process can read from the terminal. But if the background process
ignores or blocks the SIGTTIN signal, or if its process group
is orphaned (see below), then the read() returns an EIO error,
and no signal is sent. (Indeed, the idea is to tell the process
that reading from the terminal is not allowed right now.
If it does not see the signal, then it will see the error return.)
</p><p>When a background process writes to the terminal, it may get
a SIGTTOU signal. May: namely, when the flag that this must happen
is set (it is off by default). One can set the flag by
</p><blockquote>
<pre>% stty tostop
</pre>
</blockquote>

and clear it again by
<blockquote>
<pre>% stty -tostop
</pre>
</blockquote>

and inspect it by
<blockquote>
<pre>% stty -a
</pre>
</blockquote>

If TOSTOP is set but the background process ignores or blocks
the SIGTTOU signal, then the write() succeeds and no signal is sent.
If instead its process group is orphaned (see below),
then the write() returns an EIO error, and no signal is sent.
[vda: correction. The original text claimed EIO in both cases; SUS says that if SIGTTOU is blocked/ignored, write succeeds. ]
<p>
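What <code>stty tostop</code> does can also be done from a program through the
termios interface, where the flag is the <code>TOSTOP</code> bit of
<code>c_lflag</code>. A minimal sketch; the helper names <code>set_tostop()</code>
and <code>get_tostop()</code> are assumptions of this example.
</p>

```c
#define _XOPEN_SOURCE 700
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: set (on != 0) or clear (on == 0) the TOSTOP flag
 * of the terminal behind fd. Returns 0 on success, -1 on error
 * (for instance when fd is not a terminal). */
int set_tostop(int fd, int on)
{
    struct termios t;

    if (tcgetattr(fd, &t) == -1)
        return -1;
    if (on)
        t.c_lflag |= TOSTOP;
    else
        t.c_lflag &= ~(tcflag_t) TOSTOP;
    return tcsetattr(fd, TCSANOW, &t);
}

/* Hypothetical helper: 1 if TOSTOP is set, 0 if clear, -1 on error. */
int get_tostop(int fd)
{
    struct termios t;

    if (tcgetattr(fd, &t) == -1)
        return -1;
    return (t.c_lflag & TOSTOP) != 0;
}
```

<p>Both helpers return -1 when given a descriptor that is not a terminal,
such as one end of a pipe.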
</p><h3>Orphaned process groups</h3>

<p>The process group leader is the first member of the process group.
It may terminate before the others, and then the process group is
without a leader.
</p><p>A process group is called <i>orphaned</i> when <i>the
parent of every member is either in the process group
or outside the session</i>.
In particular, the process group of the session leader
is always orphaned.
</p><p>If termination of a process causes a process group to become
orphaned, and some member is stopped, then all are sent first SIGHUP
and then SIGCONT.
</p><p>The idea is that perhaps the parent of the process group leader
is a job control shell. (In the same session but a different
process group.) As long as this parent is alive, it can
handle the stopping and starting of members in the process group.
When it dies, there may be nobody to continue stopped processes.
Therefore, these stopped processes are sent SIGHUP, so that they
die unless they catch or ignore it, and then SIGCONT to continue them.
</p><p>Note that the process group of the session leader is already
orphaned, so no signals are sent when the session leader dies.
</p><p>Note also that a process group can become orphaned in two ways
by termination of a process: either it was a parent and not itself
in the process group, or it was the last element of the process group
with a parent outside but in the same session.
Furthermore, a process group can become orphaned
other than by termination of a process, namely when some
member is moved to a different process group.
</p><p>
</p><h2><a name="ss10.3">10.3 Sessions</a>
</h2>

<p>Every process group is in a unique <i>session</i>.
(When the process is created, it becomes a member of the session
of its parent.)
By convention, the session ID of a session
equals the process ID of the first member of the session,
called the <i>session leader</i>.
A process finds the ID of its session using the system call
<code>getsid(0)</code>; <code>getsid(p)</code> gives the session ID of process <code>p</code>.
</p><p>A session may have a <i>controlling tty</i>,
that then also is called the controlling tty of each of
its member processes.
A file descriptor for the controlling tty is obtained by
opening <code>/dev/tty</code>. (And when that fails, there was no
controlling tty.) Given a file descriptor for the controlling tty,
one may obtain the SID using <code>tcgetsid(fd)</code>.
</p><p>A session is often set up by a login process. The terminal
on which one is logged in then becomes the controlling tty
of the session. All processes that are descendants of the
login process will in general be members of the session.
</p><p>
</p><h3>Creation</h3>

<p>A new session is created by
</p><blockquote>
<pre>pid = setsid();
</pre>
</blockquote>

This is allowed only when the current process is not a process group leader.
In order to be sure of that we fork first:
<blockquote>
<pre>p = fork();
if (p) exit(0);
pid = setsid();
</pre>
</blockquote>

The result is that the current process (with process ID <code>pid</code>)
becomes session leader of a new session with session ID <code>pid</code>.
Moreover, it becomes process group leader of a new process group.
Both session and process group contain only the single process <code>pid</code>.
Furthermore, this process has no controlling tty.
<p>The restriction that the current process must not be a process group leader
is needed: otherwise its PID serves as PGID of some existing process group
and cannot be used as the PGID of a new process group.
</p><p>
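The properties listed above can be checked from inside a freshly forked
child. The sketch below encodes each check as one bit of the child's exit
code; the name <code>new_session_report()</code> and the bit assignment are
assumptions of this example.
</p>

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: fork, call setsid() in the child, and report what
 * the child observed as a bit mask:
 *    1  setsid() succeeded and returned the child's PID
 *    2  getsid(0) equals the child's PID (session leader)
 *    4  getpgrp() equals the child's PID (process group leader)
 *    8  /dev/tty cannot be opened (no controlling tty)
 *   16  a second setsid() fails (caller is now a process group leader)
 * Returns -1 on error. */
int new_session_report(void)
{
    int status, fd, r = 0;
    pid_t p = fork();

    if (p == (pid_t) -1)
        return -1;
    if (p == 0) {                       /* CHILD: not a group leader */
        pid_t me = getpid();

        if (setsid() == me)
            r |= 1;
        if (getsid(0) == me)
            r |= 2;
        if (getpgrp() == me)
            r |= 4;
        fd = open("/dev/tty", O_RDWR);
        if (fd == -1)
            r |= 8;
        else
            close(fd);
        if (setsid() == (pid_t) -1)
            r |= 16;
        _exit(r);
    }
    if (waitpid(p, &status, 0) != p || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

<p>If all five observations hold, the function returns 31.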
</p><h3>Getting a controlling tty</h3>

<p>How does one get a controlling terminal? Nobody knows,
this is a great mystery.
</p><p>The System V approach is that the first tty opened by the process
becomes its controlling tty.
</p><p>The BSD approach is that one has to explicitly call
</p><blockquote>
<pre>ioctl(fd, TIOCSCTTY, arg);      /* arg is 0 or 1 */
</pre>
</blockquote>

to get a controlling tty.
<p>Linux tries to be compatible with both, as always, and this
results in a very obscure complex of conditions. Roughly:
</p><p>The <code>TIOCSCTTY</code> ioctl will give us a controlling tty,
provided that (i) the current process is a session leader,
and (ii) it does not yet have a controlling tty, and
(iii) maybe the tty should not already control some other session;
if it does it is an error if we aren't root, or we steal the tty
if we are all-powerful.
[vda: correction: third parameter controls this: if 1, we steal tty from
any such session, if 0, we don't steal]
</p><p>Opening some terminal will give us a controlling tty,
provided that (i) the current process is a session leader, and
(ii) it does not yet have a controlling tty, and
(iii) the tty does not already control some other session, and
(iv) the open did not have the <code>O_NOCTTY</code> flag, and
(v) the tty is not the foreground VT, and
(vi) the tty is not the console, and
(vii) maybe the tty should not be master or slave pty.
</p><p>
</p><h3>Getting rid of a controlling tty</h3>

<p>If a process wants to continue as a daemon, it must detach itself
from its controlling tty. Above we saw that <code>setsid()</code>
will remove the controlling tty. Also the ioctl TIOCNOTTY does this.
Moreover, in order not to get a controlling tty again as soon as it
opens a tty, the process has to fork once more, to ensure that it
is not a session leader. Typical code fragment:
</p><p>
</p><pre>        if ((fork()) != 0)
                exit(0);
        setsid();
        if ((fork()) != 0)
                exit(0);
</pre>
<p>See also <code>daemon(3)</code>.
</p><p>
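The effect of the second fork can be verified by letting the grandchild
report back through a pipe. This is a sketch only; the function name
<code>double_fork_check()</code> is hypothetical, and a real daemon would of course
not report to its original parent.
</p>

```c
#define _XOPEN_SOURCE 700
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo of fork / setsid / fork.
 * Returns 1 if the grandchild ends up in a new session without being
 * its leader, 0 if not, -1 on error. */
int double_fork_check(void)
{
    int fds[2], result;
    char c;
    pid_t p, old_sid = getsid(0);

    if (pipe(fds) == -1)
        return -1;
    p = fork();
    if (p == (pid_t) -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (p == 0) {                       /* first child */
        close(fds[0]);
        if (setsid() == (pid_t) -1)     /* new session, no controlling tty */
            _exit(1);
        if (fork() != 0)                /* session leader exits */
            _exit(0);
        /* grandchild: in the new session, but not its leader */
        c = (getsid(0) != old_sid && getsid(0) != getpid());
        if (write(fds[1], &c, 1) != 1)
            _exit(1);
        _exit(0);
    }
    close(fds[1]);
    waitpid(p, NULL, 0);                /* reap the first child */
    result = (read(fds[0], &c, 1) == 1) ? c : -1;
    close(fds[0]);
    return result;
}
```

<p>Since the grandchild is not a session leader, opening a terminal later
cannot make that terminal its controlling tty.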
</p><h3>Disconnect</h3>

<p>If the terminal goes away by modem hangup, and the line was not local,
then a SIGHUP is sent to the session leader.
Any further reads from the gone terminal return EOF.
(Or possibly -1 with <code>errno</code> set to EIO.)
</p><p>If the terminal is the slave side of a pseudotty, and the master side
is closed (for the last time), then a SIGHUP is sent to the foreground
process group of the slave side.
</p><p>When the session leader dies, a SIGHUP is sent to all processes
in the foreground process group. Moreover, the terminal stops being
the controlling terminal of this session (so that it can become
the controlling terminal of another session).
</p><p>Thus, if the terminal goes away and the session leader is
a job control shell, then it can handle things for its descendants,
e.g. by sending them a SIGHUP in turn.
If on the other hand the session leader is an innocent process
that does not catch SIGHUP, it will die, and all foreground processes
get a SIGHUP.
</p><p>
</p><h2><a name="ss10.4">10.4 Threads</a>
</h2>

<p>A process can have several threads. New threads (with the same PID
as the parent thread) are started using the <code>clone</code> system
call with the <code>CLONE_THREAD</code> flag. Threads are distinguished
by a <i>thread ID</i> (TID). An ordinary process has a single thread
with TID equal to PID. The system call <code>gettid()</code> returns the
TID. The system call <code>tkill()</code> sends a signal to a single thread.
</p><p>Example: a process with two threads. Both only print TID and PID and exit.
(Linux 2.4.19 or later.)
</p><pre>% cat &lt;&lt; EOF &gt; gettid-demo.c
#define _GNU_SOURCE
#include &lt;sched.h&gt;
#include &lt;stdio.h&gt;
#include &lt;unistd.h&gt;
#include &lt;sys/types.h&gt;
#include &lt;sys/syscall.h&gt;

/* _syscall0() is gone from current headers; use syscall(2) instead */
static pid_t my_gettid(void) {
        return (pid_t) syscall(SYS_gettid);
}

static char stack[65536];       /* stack for the new thread */

int thread(void *p) {
        printf("thread: %d %d\n", my_gettid(), getpid());
        return 0;
}

int main() {
        int i;

        /* CLONE_THREAD needs CLONE_SIGHAND, which needs CLONE_VM */
        i = clone(thread, stack + sizeof(stack),
                  CLONE_VM | CLONE_SIGHAND | CLONE_THREAD, NULL);
        if (i == -1)
                perror("clone");
        else
                printf("clone returns %d\n", i);
        printf("parent: %d %d\n", my_gettid(), getpid());
        sleep(1);               /* give the thread time to print */
        return 0;
}
EOF
% cc -o gettid-demo gettid-demo.c
% ./gettid-demo
clone returns 21826
parent: 21825 21825
thread: 21826 21825
%
</pre>
<p>
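The statement that an ordinary process has a single thread with TID equal
to PID can be checked directly. A Linux-specific sketch; the helper names
<code>current_tid()</code> and <code>is_initial_thread()</code> are invented here, and the
raw <code>syscall()</code> is used so that the code does not depend on a libc
<code>gettid()</code> wrapper being present.
</p>

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (Linux only): thread ID of the calling thread,
 * obtained with the raw gettid system call. */
pid_t current_tid(void)
{
    return (pid_t) syscall(SYS_gettid);
}

/* Hypothetical helper: 1 if the caller is the initial thread of its
 * process, that is, if its TID equals the PID; 0 otherwise. */
int is_initial_thread(void)
{
    return current_tid() == getpid();
}
```

<p>Called from <code>main()</code> of a process that has started no further
threads, <code>is_initial_thread()</code> returns 1.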
</p><p>
</p><hr>

</body></html>